PREFACE.



The following elementary treatment of the Method of Least 
Squares has grown out of my attempts to so present the sub- 
ject to students of physics, astronomy, and engineering, that 
a working knowledge based upon an appreciation of its prin- 
ciples might be acquired with a moderate expenditure of time 
and labor. 

Conceiving that the ultimate warrant for the legitimacy of 
the method itself is to be found in the agreement between the 
observed distribution of residuals and the distribution repre- 
sented by the error curve, I have not scrupled to abandon 
altogether the analytical demonstrations of the equation of 
this curve and to present it as an empirical formula representing
the generalized experience of observers. The evidence in
support of a formula of this kind is necessarily cumulative, 
and the few curves which are presented in illustration of the 
law of error are to be considered as samples of the kind of 
evidence which exists in great abundance. By abandoning 
the theoretical demonstrations, the student is freed from the 
embarrassments which are usually encountered at the thresh- 
old of the subject, and which in many cases cause it to appear 
as a mathematical puzzle whose analytical difficulties absorb 
the attention of the tyro to the complete exclusion of the pur- 
poses for which the analysis is conducted. 

I have sought to give prominence to the distinction between 
accidental and systematic errors, and to insist upon the limitations
which result from the difference between these two
classes of error. To illustrate the principles of the text, I 
have made free use of numerical data and have arranged the 
computations in forms which experience has shown to be 
convenient for the purpose, with a view to their subsequent 
use by the student as models for his own computations.

In the preparation of these pages, I have consulted many, 
if not most, of the standard treatises upon the subject, but 
my indebtedness for suggestions and methods of treatment 
is principally to 

Faye, Cours d'Astronomie de l'École Polytechnique.

Oppolzer, Lehrbuch der Bahnbestimmung.

Wright, Treatise on the Adjustment of Observations.

G. C. C. 
§ 1. Problem. To determine the coefficient of linear expansion
of a certain bar of metal its length was determined at
different temperatures by comparison with a standard of known
length. The data furnished by the measures are (Kohlrausch,
Leitfaden der Physik):



Temperature.     Observed Length.
                      mm.
   20° C.           1000.22
   40               1000.65
   50               1000.90
   60               1001.05

It is required to determine from these observations the amount 
of the expansion of the bar per degree Centigrade. 

If c denote the required expansion, and $l_0$ the length of the
bar when its temperature is 0° C., its length, l, at any other
temperature, t, may be represented by the equation

$$ l = l_0 + c\,t $$

By means of this equation the four observations recorded
above are transformed into the following observation equations:

$$
\left.
\begin{aligned}
(1)\quad l_0 + 20\,c &= 1000.22 \\
(2)\quad l_0 + 40\,c &= 1000.65 \\
(3)\quad l_0 + 50\,c &= 1000.90 \\
(4)\quad l_0 + 60\,c &= 1001.05
\end{aligned}
\right\} \qquad (1)
$$

Any two of these equations are sufficient to determine the
values of $l_0$ and c, but the values derived from different pairs
of equations will be different. Thus we may find from

 Equations.        $l_0$           c
                    mm.           mm.
(1) and (2)       999.79      + 0.0215
(1) and (4)       999.80      + 0.0208
(2) and (3)       999.65      + 0.0250
(3) and (4)      1000.15      + 0.0150
   etc.            etc.          etc.
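
The pairwise solutions tabulated above may be reproduced with a short computation. The following is a minimal sketch in Python; the names `observations`, `solve_pair` and `pair_solutions` are illustrative and do not occur in the text.

```python
from itertools import combinations

# Observation equations of § 1: l0 + t*c = observed length (mm).
observations = [(20, 1000.22), (40, 1000.65), (50, 1000.90), (60, 1001.05)]

def solve_pair(obs_a, obs_b):
    """Solve l0 + t*c = l exactly for two observations."""
    (t_a, l_a), (t_b, l_b) = obs_a, obs_b
    c = (l_b - l_a) / (t_b - t_a)   # expansion per degree
    l0 = l_a - c * t_a              # length at 0 degrees
    return l0, c

# Every pair of equations, keyed by the equation numbers of the text.
pair_solutions = {
    (i + 1, j + 1): solve_pair(observations[i], observations[j])
    for i, j in combinations(range(len(observations)), 2)
}

for pair, (l0, c) in pair_solutions.items():
    print(pair, f"l0 = {l0:.2f} mm, c = {c:+.4f} mm per degree")
```

Four observations furnish six pairs, and no two pairs give the same values of the unknowns, which is the discordance discussed in the paragraphs that follow.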

We are here presented with a problem of constant recurrence
in the investigations and applications of physical science.
In order to determine the values of certain quantities with a 
high degree of precision more measures or observations are 
made than are absolutely necessary, and these observations 
prove to be inconsistent among themselves, so that the result- 
ing values of the unknown quantities depend upon the manner 
in which the data are combined. It is evident that all of the 
values above found for $l_0$ and c cannot be correct, and it is
doubtful if any absolutely correct value can be derived from 
the data ; but it is also apparent that the observations are not 
worthless and that any of the values above derived may be 
considered as approximations, more or less close, to the true 
values of the required quantities. 

If we assume that the relation between the length of a bar 
and its temperature can be expressed by an equation of the 
form employed above, we must suppose that the discordances 
in the results are due to errors in the observations, and the 
problem then becomes : 

To find from the observed data a set of results which shall 
be affected as little as possible by the errors of the data, or in 
more technical language, to find the most probable values of 
the unknown quantities. 

We may establish in advance of any formal investigation of 
this problem certain principles to which its solution must con- 
form. Thus, 

(A) The adopted values of the quantities which are to be 
determined must be based upon all the data available. Only 
in exceptional cases, which will be considered hereafter, is it 
proper to omit or reject any observation or any known rela- 
tion among the quantities. 

(B) The adopted values must satisfy the observation equations
as nearly as possible.

§ 2. Errors and Residuals. The expression error of an
observation has been freely used in the preceding section, but
it should be recognized that the amount of this error can
rarely, if ever, be known, since this would imply an exact
knowledge of the unknown quantities. We may, however,
obtain approximate values of these errors from the adopted
values of the quantities which were to be determined. Thus,
if the values $l_0$ = 999.79, c = + 0.0215 be substituted in equations
(1), these become

(1) 1000.22 = 1000.22 (3) 1000.87 = 1000.90 

(2) 1000.65 = 1000.65 (4) 1001.08 = 1001.05 

The difference between the first and second members of any 
one of these equations is called the residual of that equation, 
and is approximately the error of the corresponding observa- 
tion. The residuals, v, which correspond to the several values 
of $l_0$ and c derived in § 1 are given below in tabular form.

$l_0$ =   999.79      999.80      999.65     1000.15
c     = + 0.0215    + 0.0208    + 0.0250    + 0.0150

v     =    0.00        0.00      + 0.07      - 0.23
            .00      +  .02         .00      -  .10
         -  .03      +  .06         .00         .00
         +  .03         .00      -  .10         .00

We may thus, for any assumed values of the unknown quan- 
tities, find a corresponding set of residuals, and the smaller 
these residuals are, the closer is the probable approximation of 
the assumed, to the true values. Principle (B). 
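
The residuals belonging to the first pair of values, $l_0$ = 999.79 and c = + 0.0215, may be computed as in the sketch below. It assumes the sign convention stated above (first member minus second member, that is, computed length minus observed length); the function name `residuals` is illustrative only.

```python
# Observation equations of § 1: l0 + t*c = observed length (mm).
observations = [(20, 1000.22), (40, 1000.65), (50, 1000.90), (60, 1001.05)]

def residuals(l0, c, obs):
    """First member minus second member of each observation equation."""
    return [(l0 + c * t) - l for t, l in obs]

v = residuals(999.79, 0.0215, observations)
sum_of_squares = sum(r * r for r in v)

for number, r in enumerate(v, start=1):
    print(f"equation ({number}): v = {r:+.3f} mm")
```

Equations (1) and (2) are satisfied exactly, since the assumed values were derived from them, while equations (3) and (4) leave residuals of about three hundredths of a millimetre.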

This statement, however, requires an important qualification 
to which we now proceed. 

The errors with which any given series of observations is 
affected may be divided into two classes : 

Accidental Errors, or those whose law of recurrence is such 
that in the long run they are as often positive as negative and 
whose effect upon the mean of a great number of observations
therefore differs but little from zero ; and 

Systematic Errors, or those which in the given series of 
observations do not thus tend to be eliminated from the mean. 
In the observations considered in § 1, an error of judgment by 
which the observer in a given case read the thermometer 0°.1
too high would probably be an accidental error, since it may 
be presumed that in the long run he would read it as often too 
low as too high, but if through a fixed habit of observing, the 
thermometer were always read too high this would be a sys- 
tematic error, and the number of observations might be indefi- 
nitely increased without in the least diminishing its effect. 

If the standard of length with which the bar was compared 
were an erroneous standard (e.g., 0.01 mm. too long), all of the
observations would be affected with a systematic error due to 
this source, and the residuals would furnish no trace of this 
error, since they show only discordances among the observa- 
tions, and not errors affecting all alike. The smallness of the 
residuals in any case, therefore, furnishes no guaranty that the 
observations and the results derived from them have not been 
vitiated by systematic errors. 
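
The distinction may be illustrated by a small simulation. The sketch below is hypothetical throughout: it assumes accidental errors drawn from a normal distribution with a spread of 0.1, and a systematic error of + 0.1 added to every reading; neither figure comes from the text.

```python
import random

random.seed(1)  # fixed seed so that the illustration is repeatable

def mean_reading_error(n, accidental_sd, systematic_bias):
    """Mean error of n simulated readings (both parameters are hypothetical)."""
    errors = [random.gauss(0.0, accidental_sd) + systematic_bias
              for _ in range(n)]
    return sum(errors) / n

# Accidental errors only: the mean error shrinks toward zero.
accidental_only = mean_reading_error(10_000, 0.1, 0.0)

# The same accidental errors plus a fixed habit of reading 0.1 too high:
# the mean error stays near 0.1 however many readings are taken.
with_bias = mean_reading_error(10_000, 0.1, 0.1)

print(f"accidental only: {accidental_only:+.4f}")
print(f"with systematic error: {with_bias:+.4f}")
```

Increasing the number of readings reduces the first mean still further but leaves the second substantially unchanged.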

The presence of errors of this class constitutes the greatest 
obstacle to the accurate determination of any set of quantities 
whose values are sought, and the ingenuity and skill of the 
observer or experimenter cannot be better employed than in 
avoiding or overcoming the effect of such errors. It therefore 
deserves especial notice that systematic errors can often be 
transformed into accidental errors by varying the methods of 
observation or the conditions under which the observations are 
made. Thus the possible systematic error of judgment in 
reading a thermometer, to which allusion was made above, may 
be transformed into an accidental error if several different per- 
sons take part in the observations, since it is hardly probable 
that they will all have a common, persistent error of judgment. 
The error due to using an erroneous standard of length may 
be changed into accidental error by employing a number of 
different standards, since it is not probable that these, constructed
at different times and by different makers, will all
have a common error of length. Considerations of this char- 
acter serve to illustrate the great practical importance of vary- 
ing the methods of determining any quantity whose value is 
desired with great precision. Multiplying observations by the 
same method and under similar circumstances serves only to 
diminish the effect of accidental errors and is useless beyond a 
certain limit, while varying the methods and the circumstances
under which observations are made tends to eliminate errors 
of both kinds. 

The principles here considered find their appropriate appli- 
cation in the selection of the methods by which any given set 
of unknown quantities is to be determined, but after the obser- 
vations have been made, since they can, in general, furnish but 
little, if any, information in regard to their own systematic 
errors, these must be neglected and the reduction and discus- 
sion of the observations directed toward eliminating the effect 
of the accidental errors. 

§ 3. The Distribution of Residuals. Gauss, a German mathematician,
has shown by a course of analysis based upon the
theory of probabilities that in any given series comprising a
very large number of observations affected with accidental
errors, the number of errors of a given magnitude, x, is a function
of that magnitude. Thus, if x' and x'' denote any two
errors, and y' and y'' the number of observations having the
errors x' and x'' respectively, then

$$ y' : y'' :: f(x') : f(x'') $$

The analytical expression for f(x) obtained by Gauss is

$$ f(x) = \frac{h}{\sqrt{\pi}}\, e^{-h^2 x^2} \qquad (2) $$

where e = base of the Naperian system of logarithms,
$\pi$ = ratio of the circumference to the diameter of a circle,
h = a number whose value must be derived for each series
of observations, but is constant for all the observations
of that series.

The same expression for f(x) has been derived by other
mathematicians through different courses of analysis, but 
against all of these investigations objections of a theoretical 
character have been urged. Experience, however, shows that 
the actual distribution of residuals does follow this law, not 
with absolute accuracy, but to a remarkable degree of approxi- 
mation. An excellent illustration of this distribution in the 
case of a comparatively small number of observations is 
afforded by a series of 66 determinations of the velocity of 
light made at Washington, in the year 1882.* By means of a 
revolving mirror the time required for the passage of a ray of 
sunlight from one terrestrial point to another was measured. 
The mean of the 66 determinations of this time interval was 
24.827 millionths of a second. By subtracting this mean from
each single determination a series of residuals will be obtained, 
and the number of residuals whose magnitude equals 1, 2, 3, 
etc. units may then be counted. In this way a fair approxi- 
mation to the distribution of residuals represented by Gauss's 
law of error will be found; but as this law purports to repre- 
sent the average distribution of a great number of errors, we 
shall obtain a better comparison between it and the actual dis- 
tribution by the following device, to which we resort in order 
to increase the number of available residuals : 

Let it be assumed that in any given set of observations the
number of residuals of magnitude x is proportional to the number
of residuals occurring between the limits $x - a$ and $x + a$,
where a is a quantity which in strictness ought to be an infinitesimal,
but which may be made a small finite quantity without
appreciable error. In the present case we adopt as the unit in
which the residuals are to be expressed, the thousand-millionth
part of a second (0.000000001 second), and put a equal to two such
units. Thus, from a series of 66 observations are derived the
following numbers which represent the distribution of residuals
which might be expected to occur in a much longer series.
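
The counting device just described may be sketched as follows. The residuals used here are hypothetical and are not the Washington determinations, which are not reproduced in this text; the function name `windowed_counts` is likewise illustrative.

```python
def windowed_counts(residuals, centers, a):
    """For each magnitude x, count the residuals between x - a and x + a."""
    return {x: sum(1 for r in residuals if x - a <= r <= x + a)
            for x in centers}

# Hypothetical residuals, expressed in the adopted unit.
sample = [-3.2, -1.1, -0.4, 0.3, 0.9, 1.7, 2.6, 4.8]
centers = [-2.5, -1.5, -0.5, 0.5, 1.5, 2.5]

counts = windowed_counts(sample, centers, a=2)

# Each count expressed as a percentage of the whole number counted.
total = sum(counts.values())
percentages = {x: 100 * k / total for x, k in counts.items()}

for x in centers:
    print(f"{x:+5.1f}  No. {counts[x]:2d}  {percentages[x]:5.1f} %")
```

Because the windows overlap, each residual is counted several times, and the whole number counted exceeds the number of observations; this is the sense in which the device increases the number of available residuals.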

* Velocity of Light in Air and Refracting Media. Bureau of Navi- 
gation, Navy Department, 1886, p. 187. 
      Residual.          No.     %             Residual.            No.     %

Less than  - 13.5         2     0.8      Greater than + 13.5         0     0.0
Equal to   - 13.5         0     0.0      Equal to     + 13.5         2     0.8
           - 12.5         2     0.8                   + 12.5         2     0.8
           - 11.5         2     0.8                   + 11.5         3     1.2
           - 10.5         2     0.8                   + 10.5         6     2.3
           -  9.5         3     1.2                   +  9.5         5     1.9
           -  8.5         2     0.8                   +  8.5         6     2.3
           -  7.5         4     1.6                   +  7.5         7     2.7
           -  6.5         6     2.3                   +  6.5         8     3.1
           -  5.5         8     3.1                   +  5.5        10     3.9
           -  4.5        12     4.7                   +  4.5        12     4.7
           -  3.5        15     5.8                   +  3.5        15     5.8
           -  2.5        18     7.0                   +  2.5        17     6.6
           -  1.5        21     8.2                   +  1.5        21     8.2
           -  0.5        23     8.9                   +  0.5        23     8.9

The column headed % represents the number of residuals 
differing not more than half a unit from the magnitude given 
in the first column, expressed as a percentage of the whole 
number of residuals. Fig. A furnishes a graphical representa- 
tion of this distribution, each percentage in the above table 
being represented by a point whose abscissa is the magnitude
of the residual and whose ordinate is the percentage itself.
The curve whose equation is

$$ y = \frac{100\,h}{\sqrt{\pi}}\, e^{-h^2 x^2}, \qquad h = 0.158 $$

is shown in the same figure, and a simple inspection of the 
curve shows that its ordinates represent very approximately 
the percentage of residuals of each magnitude. The coeffi- 
cient h appears multiplied by the factor 100 in order that the 
ordinates may be represented as percentages. 
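
In the absence of the figure, the comparison may be made numerically. The sketch below evaluates the curve with h = 0.158 and sets its ordinates beside some of the observed percentages of the table; the function name `error_curve_percent` is an illustrative choice.

```python
import math

H = 0.158  # value of h adopted in the text for Fig. A

def error_curve_percent(x, h=H):
    """Ordinate of y = (100 h / sqrt(pi)) * exp(-h^2 x^2), in per cent."""
    return 100.0 * h / math.sqrt(math.pi) * math.exp(-(h * x) ** 2)

# Observed percentages for the positive residuals, copied from the table.
observed = {0.5: 8.9, 1.5: 8.2, 2.5: 6.6, 3.5: 5.8, 4.5: 4.7, 5.5: 3.9}

for x, pct in observed.items():
    print(f"x = {x:4.1f}   observed {pct:4.1f} %   "
          f"curve {error_curve_percent(x):4.1f} %")
```

The ordinate at x = 0 is 100h divided by the square root of $\pi$, or about 8.9 per cent, and the ordinates diminish on either side as the table does.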

Figs. B, C, D, represent the distribution of residuals in three 
other series of observations of different kinds, made at differ- 
ent places, by different observers, but all following the same 
law. The unit in which the residuals are expressed, unit of x,
is stated with each figure, and the unit of y is in every case 
one per cent of the whole number of residuals. 

The equations of the several curves shown in the figures are
almost identical, but the feature to which the student's attention
is called is that the algebraic form of the equation is in
each case

$$ y = \frac{100\,h}{\sqrt{\pi}}\, e^{-h^2 x^2} $$

and not that h has approximately the same value in each
curve. The numerical value of h depends upon the unit 
adopted for x, and these units having been chosen with refer- 
ence to a convenient graphical representation of the residuals, 
the agreement in the several values of h must be regarded as 
purely artificial. 

The series of observations represented in Fig. D is known 
to be affected with small systematic errors, and it will be 
noted that the distribution of the residuals is more irregular 
in this case than in any of the others. In each of the series 
represented in Figs. A and C, there are two residuals whose 
magnitudes are too great to be represented in the figures ; and 
it is quite generally found that the actual number of very large 
residuals is slightly greater than the number given by the 
error curve. The illustrations here given are typical cases, 
and may serve to exemplify the statement made at the begin- 
ning of this section, that the actual distribution of residuals is 
found to follow Gauss's law of error, and in the following sec- 
tions this law will be assumed as experimentally demonstrated, 
and from it will be derived the method of combining and dis- 
cussing observations. The student will find it an instructive 
exercise to treat in a manner similar to that pursued above 
any series of observations to which he may have access, par- 
ticularly his own observations, and thus lend additional weight 
to the experimental evidence which is here presented for his 
consideration. 

§ 4. The Error Curve. From the manner in which the ordi- 
nates of the points plotted in Figs. A, B, C, and D were derived, 
it will be apparent that these ordinates represent the number 
of residuals falling within certain chosen limits of error. Thus 
in Fig. A, 8.9 per cent of all the residuals lie between the 
limits 0 and +1, 8.2 per cent between +1 and +2, etc., the
interval within which the residuals are enumerated being in
every case one unit. It is also evident that the number of
residuals falling within any other interval, $\Delta x$, will depend
upon the magnitude of this interval as well as upon the ordinate
corresponding to it, and if $\Delta x$ is taken sufficiently small
the number of residuals will be proportional to the product
$y \cdot \Delta x$. Geometrically considered, this product is the area included
between the axis of x, the curve, and the two ordinates
drawn through the extremities of $\Delta x$, and the number of residuals
falling within the limits of $\Delta x$ is therefore proportional
to this area. We may, if we choose, make $\Delta x$ an infinitesimal,
and the area $y \cdot \Delta x$ and the corresponding number of residuals
will then become indefinitely small, but by taking the sum of
all the infinitesimal areas included between the limits x = a
and x = b, where a and b have any values whatever, we obtain
the area of that part of the curve included between ordinates
drawn at these limits. By a similar process of summation we
obtain the number of residuals lying between a and b, and the
number of residuals thus found must be proportional to the 
area, since this proportionality is true in every infinitesimal 
element included in the area. 

In the following table, the function, A, represents the area
of that part of the error curve included between ordinates
whose abscissas are 0 and x, the argument of the table being
the values of x for the particular error curve in whose equation
h = 1; but the area included between 0 and x in the curve
corresponding to any other value of h may be found from the
same table, by using as the argument hx instead of x.

The area of that part of the curve lying between the limits
a and b is represented by

$$ A = \int_a^b y\,dx = \frac{h}{\sqrt{\pi}} \int_a^b e^{-h^2 x^2}\,dx \qquad (3) $$

Let the variable in this expression be changed by putting
hx = t, and the expression becomes

$$ A = \frac{1}{\sqrt{\pi}} \int_{ha}^{hb} e^{-t^2}\,dt $$

These expressions for A become identical if h = 1; hence, if
the value of A be computed from the second integral for h = 1
and tabulated, we may find from this table the value of A corresponding
to any other value of h by changing the limits a,
b into ha and hb.

A remarkable property of the curve, which will be of use 
hereafter, may be readily obtained from the expression here 
found for A. If we make a = 0 and b = $\infty$, the limits of the
integral become 0 and $\infty$ for all values of h, hence the area of
that part of the curve included between x = 0 and x = $\infty$ is
the same for all values of h, i.e., for every series of observations.
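
The tabular function A may also be computed directly, since the integral from 0 to hx of the second expression is one half of what is now commonly called the error function. The sketch below relies upon `math.erf` from the Python standard library; the function name `area` is illustrative.

```python
import math

def area(hx):
    """Area A of the error curve between the limits 0 and x, argument h*x."""
    # erf(z) = (2 / sqrt(pi)) * integral from 0 to z of exp(-t^2) dt,
    # so the area from 0 to hx is one half of erf(hx).
    return 0.5 * math.erf(hx)

for hx in (0.1, 0.5, 1.0, 2.0, 3.0):
    print(f"hx = {hx:3.1f}   A = {area(hx):.5f}")

# The limiting area is the same whatever the value of h.
print(area(math.inf))
```

The values so obtained agree with the table which follows to the number of places there given, and the area from 0 to $\infty$ is one half for every h.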



Table of Areas of the Error Curve between the Limits 0 and hx.

  hx      A     Diff.       hx      A     Diff.       hx       A       Diff.

 0.0    0.000    56        1.0    0.421    19        2.0    0.49766     85
 0.1     .056    55        1.1     .440    15        2.1     .49851     56
 0.2     .111    53        1.2     .455    12        2.2     .49907     36
 0.3     .164    50        1.3     .467     9        2.3     .49943     23
 0.4     .214    46        1.4     .476     7        2.4     .49966     14
 0.5     .260    42        1.5     .483     5        2.5     .49980      9
 0.6     .302    37        1.6     .488     4        2.6     .49989      4
 0.7     .339    32        1.7     .492     3        2.7     .49993      3
 0.8     .371    27        1.8     .495     1        2.8     .49996      2
 0.9     .398    23        1.9     .496     2        2.9     .49998      1
 1.0     .421              2.0     .498              3.0     .49999
                                                     $\infty$       .50000





If in any series of observations $n_a^b$ denote the number of
residuals whose magnitudes are included between the limits a
and b, n the whole number of residuals in the series, and $A_a$,
$A_b$, the values of A obtained from the table with the arguments
ha, hb, then

$$ n_a^b = n\,(A_b \mp A_a) $$

since the ratio of $n_a^b$ to n is equal to the ratio of the area of
that part of the error curve which lies between the limits a
and b, to the area of the whole curve, and this latter area is
seen from the table to be always unity. The minus sign in this
equation is to be used when a and b have like signs, and the plus
sign when they have unlike signs. If the percentage of residuals
between the limits a and b is required, it may be found by
substituting 100 in place of n as the coefficient of $(A_b \mp A_a)$.
Thus from Fig. A we find for the series of observations there
represented, $h^2$ = 0.025 and h = 0.158.

To find the distribution of residuals between the limits
$-\infty$ to $-5$, $-5$ to $-2$, $-2$ to $+1$, $+1$ to $+4$, $+4$ to $+\infty$,
we proceed as follows:



   x         hx         A      $A_b \mp A_a$    $n_a^b$    Per cent.    Obs.

                                      ($n_a^b = 66\,(A_b \mp A_a)$)

 $-\infty$       $-\infty$      0.500
  - 5     - 0.790      .368       0.132          9        13.2         9
  - 2     - 0.316      .172        .196         13        19.6        11
  + 1     + 0.158      .088        .260         17        26.0        17
  + 4     + 0.632      .314        .226         15        22.6        13
 $+\infty$       $+\infty$       .500        .186         12        18.6        16

                                  1.000         66       100.0        66



The numbers in the column "Per cent." may be compared
with the percentages given in the table of § 3. The column "Obs."
gives the actual number of residuals which occur in the given
series between the limits here considered, and these numbers
should be compared with the column "$n_a^b$."
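
The computation of the foregoing table may be sketched as below, with h = 0.158 and n = 66 as in the text. Instead of the rule of signs, the sketch uses the fraction of residuals algebraically less than each limit, which gives the same differences; this reformulation and the function names are assumptions of the sketch, not the procedure of the text.

```python
import math

def fraction_below(x, h):
    """Fraction of residuals algebraically less than x under the error curve."""
    return 0.5 * (1.0 + math.erf(h * x))

def expected_counts(limits, h, n):
    """Expected number of residuals between each pair of successive limits."""
    return [n * (fraction_below(b, h) - fraction_below(a, h))
            for a, b in zip(limits, limits[1:])]

limits = [-math.inf, -5, -2, 1, 4, math.inf]
expected = expected_counts(limits, h=0.158, n=66)
rounded = [round(e) for e in expected]
observed = [9, 11, 17, 13, 16]   # column "Obs." of the text

for (a, b), e, o in zip(zip(limits, limits[1:]), rounded, observed):
    print(f"{a:+6} to {b:+6}   computed {e:2d}   observed {o:2d}")
```

Taking differences of the fraction below each limit disposes of the rule of signs, since an area on the negative side of the origin is then counted as a deficiency from one half.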

By the use of this table, the distribution of residuals in any 
series of observations for which the value of h is known may 
be compared with the theoretical distribution much more 
readily than by plotting a curve, and the student should in 
this way examine several series of observations. The method 
of determining h for any given series is contained in § 12. 
§ 5. The Principle of Least Squares. The quantity h which
appears in the equation of the error curve deserves especial
attention. If in the equation

$$ y = \frac{h}{\sqrt{\pi}}\, e^{-h^2 x^2} $$

x be put equal to zero, the resulting value of y is $h/\sqrt{\pi}$. This is
the maximum ordinate of the curve, and the value of this
maximum ordinate varies directly as h. If those parts of the
curve remote from the axis of y be considered, it will be found
that the larger is h, the smaller are the values of y, since when
x is a large quantity $e^{-h^2 x^2}$ diminishes much more rapidly for
increasing values of h than h itself increases. These relations
between y and h correspond exactly to the criteria by which
we estimate the precision of observations. If we compare two
series of observations, I. and II., and find that in series I. the
small errors are relatively more numerous (large values of y
for small x's), and the large errors less numerous (small values
of y for large x's), than in series II., we shall without hesitation
call the observations of series I. more precise or accurate
than those of series II.; and if required to assign definite
meanings to the terms "more precise" and "less precise," we
shall find difficulty in defining them in any other manner than
by reference to the magnitude of the residuals. We therefore
adopt as the measure of precision of any series of observations 
the value of h in the equation of its error curve ; and having 
thus defined the term "precision," we are able to state two 
principles which are of general application in the discussion 
of observations. 

Let the data furnished by each observation be expressed in 
the form of an observation equation (Equations 1, § 1), then: 
the best attainable values of the unknown quantities are those 
which, 

(1) Distribute the residuals in accordance with the law of
error, $y = \frac{h}{\sqrt{\pi}}\, e^{-h^2 x^2}$, and which,

(2) Make the value of h in the equation of the resulting
error curve a maximum.

The first of these principles is indeed involved in the second,
since if the residuals are not distributed in accordance
with the law

$$ y = \frac{h}{\sqrt{\pi}}\, e^{-h^2 x^2} $$

there can be no value of h to be made a maximum. It is, however,
advantageous to state (1) as a separate principle, since it
affords a test of the presence of systematic errors in the data, 
which, though far from being a perfect criterion, is often con- 
venient and is sometimes the only test available. 

To justify the statement of (2), we resort to the following 
considerations : In accordance with A, § 1, we suppose that all 
of the data available is contained in the observation equations, 
and, B, § 1, we seek to satisfy all of these equations as nearly 
as possible. If the observations are free from systematic 
error, a supposition which must here be made, since we have 
no means of taking into account the effect of such errors, we 
may obtain by substituting in the observation equations any 
set of values which approximately satisfy them, a correspond- 
ing set of residuals which will be the errors of the observations, 
on the supposition that the substituted values were the true 
values of the unknown quantities. If these residuals are plotted
in an error curve, they will furnish a numerical measure
of the precision h, assigned to the observations by this set of
values, and out of all possible sets of values of the unknown 
quantities that set which assigns the maximum precision to 
the observations will be entitled to the greatest degree of con- 
fidence ; for if it were otherwise, we should have no reason for 
preferring a set of values which exactly satisfied all of the 
equations to a set which did not satisfy them. 

It is, of course, true that subsequent observations may fur- 
nish a better determination of the unknowns, and that the 
values thus found will not assign to the earlier observations as 
high a degree of precision as did the erroneous values obtained
from these observations alone, but this subsequent determination
is based upon additional evidence, and the problem with
which we are concerned is not to obtain the best possible val- 
ues of the unknown quantities, but the best values which can 
be derived from the data in our possession. 

Assuming, then, the validity of (2), we proceed to transform 
it into an expression more convenient for practical use, and for 
this purpose we resort to the following property of the error 
curve, which may be approximately verified by actual measurement
from any of the plotted curves, Figs. A, B, C, D.

If the error curve be divided into a great number of parts 
by drawing equidistant ordinates throughout its whole extent, 
and the areas of the several parts into which the curve is thus 
divided be each multiplied by the square of the abscissa of its 

middle point, the sum of all these products will equal — -, 

^ ft 

The analytical expression for the process above described is 



x^ydx or — - I x^e-^^^*dx 



Put ?ix = tf and this integral" becomes 

For the method of obtaining the value of the last integral, 
see Newcomh^s Calculus^ Articles 169, 176. 
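
The property may also be verified numerically by carrying out the process as described: the curve is cut into narrow strips, and the area of each is multiplied by the square of the abscissa of its middle point. The sketch below assumes h = 0.158; the strip width and the range of summation are arbitrary choices suited to that value of h.

```python
import math

def second_moment(h, half_width=60.0, steps=12_000):
    """Sum of (strip area) * (midpoint abscissa)^2 over the error curve."""
    dx = 2.0 * half_width / steps
    total = 0.0
    for i in range(steps):
        x = -half_width + (i + 0.5) * dx      # middle point of the strip
        y = h / math.sqrt(math.pi) * math.exp(-(h * x) ** 2)
        total += (y * dx) * x * x             # area of the strip times x^2
    return total

h = 0.158
approx = second_moment(h)
exact = 1.0 / (2.0 * h * h)

print(f"summation: {approx:.6f}")
print(f"1/(2h^2):  {exact:.6f}")
```

Beyond the range of summation the ordinates are insensible, so that the finite sum represents the whole extent of the curve.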

The area of each of the parts into which the curve was 
divided is proportional to the number of residuals occurring 
between the limiting ordinates of the part ; thus, let A denote 
the area of the part, N the corresponding number of residuals, 
and n and a the whole number of residuals and the whole area 
of the curve respectively ; then 

$$ A : N :: a : n $$

but from the table in § 4, a = 1, whence

$$ A = \frac{N}{n} \quad\text{and}\quad A x^2 = \frac{N x^2}{n} $$

Since N denotes the number of x's falling within the given
infinitesimal part, A, of the curve, $N x^2$ is equal to the sum
of the squares of the x's (residuals) whose magnitudes fall
between the limiting ordinates of A, and taking the sum of all
the $A x^2$'s we obtain

$$ \sum A x^2 = \frac{\sum x^2}{n} $$

i.e., the sum is equal to the mean of the squares of all the residuals.
It is customary to represent the sum of the squares of the 
residuals by the symbol [vv], v standing for any residual, and 
the [ ] denoting the sum of all quantities of the kind written 
within them. 

Comparing this result with the one obtained above, we have

$$ \frac{[vv]}{n} = \frac{1}{2h^2} \qquad (4) $$

from which it appears that the relation between h and the
sum of the squares of the residuals is such that when h is
a maximum, [vv] is a minimum, and principle (2) may be
restated as follows:

The most probable values of the unknown quantities are those
which make the sum of the squares of the residuals a minimum.

From this principle has been derived the name Method of 
Least Squares, which is commonly applied to that body of 
principles which treats of the combination and discussion 
of observed data. 
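
As an illustration, the principle may be applied to the observation equations of § 1. The sketch below uses the well-known closed form for fitting a straight line, obtained by setting the derivatives of [vv] with respect to the two unknowns equal to zero; that closed form is not derived in this section, and the function names are illustrative.

```python
# Observation equations of § 1: l0 + t*c = observed length (mm).
observations = [(20, 1000.22), (40, 1000.65), (50, 1000.90), (60, 1001.05)]

def sum_of_squares(l0, c, obs):
    """[vv]: the sum of the squares of the residuals of l0 + c*t."""
    return sum(((l0 + c * t) - l) ** 2 for t, l in obs)

def least_squares_line(obs):
    """Values of l0 and c which make [vv] a minimum (closed form for a line)."""
    n = len(obs)
    t_mean = sum(t for t, _ in obs) / n
    l_mean = sum(l for _, l in obs) / n
    s_tt = sum((t - t_mean) ** 2 for t, _ in obs)
    s_tl = sum((t - t_mean) * (l - l_mean) for t, l in obs)
    c = s_tl / s_tt
    return l_mean - c * t_mean, c

l0_best, c_best = least_squares_line(observations)
best = sum_of_squares(l0_best, c_best, observations)

# The four pairwise solutions tabulated in § 1, for comparison.
pairwise = [(999.79, 0.0215), (999.80, 0.0208),
            (999.65, 0.0250), (1000.15, 0.0150)]
pairwise_ss = [sum_of_squares(l0, c, observations) for l0, c in pairwise]

print(f"l0 = {l0_best:.3f} mm, c = {c_best:+.4f} mm per degree, [vv] = {best:.5f}")
print("pairwise [vv]:", [f"{s:.5f}" for s in pairwise_ss])
```

The values so found, about 999.804 and + 0.0212, give a smaller [vv] than any of the four pairwise solutions, though the first of these comes very near it.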

We have arrived at this principle from a consideration of 
that class of cases in which the quantity observed is a func- 
tion of two or more unknown quantities whose values are to 
be obtained from the observations. This obviously includes 
the case of a single quantity, x, whose value is directly measured;
and it will be advantageous to apply the principle of
least squares to this case. The observation equations are here 
of the simplest possible form. 

$$
\begin{aligned}
x &= m_1 \\
x &= m_2 \\
x &= m_3 \\
&\ \text{etc.}
\end{aligned}
$$

where m denotes an observed value of x. 

If $x_0$ denote any assumed value of x, the residuals obtained
by substituting it in these equations will be

$$
\begin{aligned}
v_1 &= m_1 - x_0 \\
v_2 &= m_2 - x_0 \\
&\ \ \vdots \\
v_n &= m_n - x_0
\end{aligned}
$$

and

$$ [vv] = (m_1 - x_0)^2 + (m_2 - x_0)^2 + (m_3 - x_0)^2 + \cdots + (m_n - x_0)^2. $$

The value of $x_0$ which will make [vv] a minimum is found
from

$$ \frac{d[vv]}{dx_0} = 0 = -2(m_1 - x_0) - 2(m_2 - x_0) - 2(m_3 - x_0) - \cdots - 2(m_n - x_0) $$

but this equation is equivalent to

$$ x_0 = \frac{m_1 + m_2 + m_3 + \cdots + m_n}{n} = \frac{[m]}{n} $$

and it thus appears that the universal practice of taking the
arithmetical mean of all the measures of a single quantity as 
the best value of that quantity, is a particular case under the 
more general method of least squares. 
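
A numerical check of this result is easily made. The measures in the sketch below are hypothetical; the point is only that [vv] is smaller at the arithmetical mean than at any neighboring assumed value.

```python
def sum_of_squares(x0, measures):
    """[vv] for an assumed value x0 of the measured quantity."""
    return sum((m - x0) ** 2 for m in measures)

measures = [10.2, 9.8, 10.1, 9.9, 10.3]   # hypothetical measures of one quantity
mean = sum(measures) / len(measures)

at_mean = sum_of_squares(mean, measures)
nearby = [sum_of_squares(mean + d, measures) for d in (-0.1, -0.01, 0.01, 0.1)]

print(f"mean = {mean:.2f}, [vv] at the mean = {at_mean:.4f}")
print("[vv] at neighboring values:", [f"{s:.4f}" for s in nearby])
```

Displacing the assumed value by d from the mean increases [vv] by n times the square of d, whatever the sign of d.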

§ 6. Weights. It frequently happens that the circumstances 
under which an observation was made lead the observer to dis- 
trust its accuracy, while other causes give him increased confi- 
dence in another observation. Observations which thus differ 
in quality are said to have different weights, the weight being 
a numerical measure of the quality, and these weights should 
be taken into account in combining the observations. 

Let us suppose two series of observations made upon the same 
unknown quantity, in one of which the individual observations 
are of different quality and entitled to different degrees of con- 
fidence, while in the other the observations are all equally good, 
but each of them entitled to less confidence than the poorest 
observation of the first series. By taking the mean of a num- 
ber of observations of this second series, a more reliable value 
of the unknown quantity may be obtained than any single
observation of the series can furnish, and by properly choosing
the number of observations to be included in the mean, a value 
entitled to as much confidence as any observation of the first 
series may be found. This number of observations of the sec- 
ond series whose mean is entitled to as much confidence as a 
single observation of the first series, is called the weight of the 
equivalent observation in the first series ; and, obviously, the 
better an observation, the greater is its weight. These weights 
furnish no information about the absolute precision of the 
observations, but express only their relative excellence as compared
with each other; hence, if $p_1$, $p_2$, $p_3$, etc., be the weights
of any observations, $kp_1$, $kp_2$, $kp_3$, etc., where k is any constant,
will express these weights equally well, since it is the ratios 
of the weights, and not their absolute values, which are of 
importance. 

To exhibit the manner in which these weights are to be 
employed, let us recur to the data of § 1, and suppose that 
those observations were made under such conditions that the 
first one has a weight 1, the second 2, the third 3, and the 
fourth 4. In accordance with the definition of weights, this is 
equivalent to supposing a second series of observations of uni- 
form excellence, such that the first of the actual observations 
can be replaced by one observation of this series, which must 
of course be numerically the same as the observation which it 
replaces ; the second real observation may be replaced by two 
numerically equal observations of the second series; the third
by three, etc. Each of these substituted observations will furnish
an equation precisely like those given in § 1, and when
the sum of the squares of the residuals is formed, we shall
obtain

$$v_1^2 + 2v_2^2 + 3v_3^2 + 4v_4^2.$$

The symbol [pvv], which is adopted as an abbreviation for
this expression, is equivalent, numerically, to the sum obtained
by multiplying the square of each actual residual by the weight
of the corresponding observation and adding the products, and
it is evident that this [pvv] bears the same relation to the
substituted observations that [vv] bore to the actual observations
in the case of equal weights, which was considered in the
preceding section. The principle there obtained may therefore
be generalized as follows:

The most probable values of the unknown quantities are those
which make the sum of the weighted squares of the residuals,
[pvv], a minimum.

Let the student show, as in the preceding section, that when
this principle is applied to the case of observations of unequal
weight made upon a single unknown quantity, it gives as the
most probable value of that quantity

$$x_0 = \frac{[pm]}{[p]}.$$

As an example of the application of weights, we select the 
following observations of the time of ending of the transit of 
Mercury of May 6, 1878, which were observed by different 
observers in the city of Washington. These observers were 
provided with telescopes of different sizes and magnifying 
powers, and differed among themselves in point of experience 
and skill, so that their observed times of last contact are not
entitled to equal confidence. The weights assigned to the sev- 
eral observations represent the judgment of the computer with 
respect to their relative excellence. (Washington Observa- 
tions, 1876, Appendix II., page 55.) 

Observed Time.        p       pm

5h 38m 23s            1       23
   37  55
   38  10             1       10
   38  26             3       78
   38  21             2       42
   38  18             2       36
   38  19             3       57
   38  21             2       42
   38  15             2       30

[pm] = 318        [p] = 16        [pm]/[p] = 19.9

Weighted mean of the observations = 5h 38m 19.9s.
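The arithmetic of this example is easily repeated. In the sketch below (Python) the seconds are reckoned from 5h 38m, as in the pm column; the observation at 5h 37m 55s, whose weight is blank in this copy, is assumed to contribute nothing to either sum, an assumption which the printed totals [pm] = 318 and [p] = 16 bear out. The variable names are those of this example only.

```python
# Seconds past 5h 38m and the assigned weights, from the table.
seconds = [23, 10, 26, 21, 18, 19, 21, 15]
weights = [1, 1, 3, 2, 2, 3, 2, 2]

pm_sum = sum(p * m for p, m in zip(weights, seconds))   # [pm]
p_sum = sum(weights)                                    # [p]
weighted_mean = pm_sum / p_sum                          # [pm] / [p]

print(pm_sum, p_sum, weighted_mean)
```

The quotient, 19.875, is the 19.9 seconds of the text when rounded to one decimal place.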
§ 7. Normal Equations. We have now to show how the
principle of least squares is to be applied in determining the
values of a set of unknown quantities, and in order to fix
the ideas as definitely as possible, let it be supposed that there
are three of these quantities, x, y, z, which are connected with
each one of a set of observed quantities, n, by the relation

$$ax + by + cz = n$$

where a, b, and c are numerical coefficients whose values are
different in the several equations but are supposed known in
each equation. From a series of more than three observed
values of n, the most probable values of x, y, z are to be
obtained by means of the relation [pvv] = a minimum.

It is not to be presumed that these values when found will
exactly satisfy all the equations, and make [pvv] = 0, but we
shall find from each equation a residual v, so that strictly the
observation equations should be written

$$\begin{aligned}
a_1x + b_1y + c_1z - n_1 &= v_1, &\qquad &\text{weight } p_1 \\
a_2x + b_2y + c_2z - n_2 &= v_2, &\qquad &\text{weight } p_2 \\
a_3x + b_3y + c_3z - n_3 &= v_3, &\qquad &\text{weight } p_3 \\
\text{etc.} &
\end{aligned}$$

The symbols $p_1$, $p_2$, $p_3$ represent the weights assigned to the
observed values $n_1$, $n_2$, $n_3$, etc.

By the ordinary rule for determining a minimum of a function
of several variables, the condition [pvv] = a minimum
furnishes the three equations

$$\frac{d[pvv]}{dx} = 0, \qquad \frac{d[pvv]}{dy} = 0, \qquad \frac{d[pvv]}{dz} = 0,$$

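and in order to obtain these derivatives we form from the
observation equations

$$\begin{aligned}
{}[pvv] ={}& \{a_1\sqrt{p_1}\,x + b_1\sqrt{p_1}\,y + c_1\sqrt{p_1}\,z - \sqrt{p_1}\,n_1\}^2 \\
&+ \{a_2\sqrt{p_2}\,x + b_2\sqrt{p_2}\,y + c_2\sqrt{p_2}\,z - \sqrt{p_2}\,n_2\}^2 \\
&+ \{a_3\sqrt{p_3}\,x + b_3\sqrt{p_3}\,y + c_3\sqrt{p_3}\,z - \sqrt{p_3}\,n_3\}^2 \\
&+ \text{etc.}
\end{aligned} \tag{5}$$

The derivative of this expression with respect to x is

$$\begin{aligned}
\frac{d[pvv]}{dx} ={}& 2a_1\sqrt{p_1}\,\{a_1\sqrt{p_1}\,x + b_1\sqrt{p_1}\,y + c_1\sqrt{p_1}\,z - \sqrt{p_1}\,n_1\} \\
&+ 2a_2\sqrt{p_2}\,\{a_2\sqrt{p_2}\,x + b_2\sqrt{p_2}\,y + c_2\sqrt{p_2}\,z - \sqrt{p_2}\,n_2\} \\
&+ 2a_3\sqrt{p_3}\,\{a_3\sqrt{p_3}\,x + b_3\sqrt{p_3}\,y + c_3\sqrt{p_3}\,z - \sqrt{p_3}\,n_3\} \\
&+ \text{etc.} = 0
\end{aligned} \tag{6}$$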

Let this expression be expanded, divided by 2 and simplified
by the introduction of [ ] to denote the sum of all terms like
those placed within them (all terms standing in the same vertical
column), and it becomes the first of the following group of

Normal Equations.

$$\begin{aligned}
{}[paa]x + [pab]y + [pac]z - [pan] &= 0 \\
{}[pab]x + [pbb]y + [pbc]z - [pbn] &= 0 \\
{}[pac]x + [pbc]y + [pcc]z - [pcn] &= 0
\end{aligned} \tag{7}$$

The second and third of these equations are derived in precisely
the same manner as the first from the conditions

$$\frac{d[pvv]}{dy} = 0, \qquad \frac{d[pvv]}{dz} = 0.$$

These equations are equal in number to the unknown quanti- 
ties, and their solution will in general furnish a determinate 
set of values for these quantities which will be the most prob- 
able values, since the normal equations include all of the data 
furnished by the observations and have been so derived as to 
satisfy the principle of least squares. 

Equations (6) furnish a rule which is frequently given for 
the formation of normal equations. To obtain the first normal 
equation, multiply each observation equation by the product of 
its weight into the coefficient of x which occurs in it, and take 
the sum of all the resulting equations. The other normals are 
similarly obtained from the weights and the coefficients of y,
z, etc., having due regard to the algebraic signs of the quantities
in the several multiplications and divisions. This method
is occasionally convenient, but in general the method of forming
normal equations given in § 9 will be found less laborious.

The symmetrical manner in which the coefficients of the 
normal equations are disposed should be especially noted, since 
this considerably diminishes the labor of their formation. The 
first coefficient in the second equation is the same as the second 
coefficient in the first equation, and generally the mth coefficient
in the nth equation is the same as the nth coefficient in the mth
equation.
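The rule for forming the normal equations, and the symmetry just noted, may be illustrated by a short sketch in Python. The function name and the four observation equations are invented for the purpose and form no part of the text; each equation is taken in the form ax + by + cz - n = v with weight p, as in this section.

```python
def normal_equations(rows):
    """Form [paa], [pab], ... and [pan], [pbn], [pcn].

    rows holds tuples (a, b, c, n, p) for a*x + b*y + c*z - n = v, weight p.
    """
    N = [[0.0] * 3 for _ in range(3)]
    rhs = [0.0] * 3
    for a, b, c, n, p in rows:
        coef = (a, b, c)
        for i in range(3):
            # multiply the equation by weight * coefficient of the i-th unknown
            rhs[i] += p * coef[i] * n
            for j in range(3):
                N[i][j] += p * coef[i] * coef[j]
    return N, rhs

# Invented observation equations, for illustration only.
rows = [
    (1, 0, 0, 1.0, 1),
    (0, 1, 0, 2.0, 1),
    (0, 0, 1, 3.0, 2),
    (1, 1, 1, 6.1, 1),
]
N, rhs = normal_equations(rows)

# The coefficient array is symmetrical about its diagonal.
assert all(N[i][j] == N[j][i] for i in range(3) for j in range(3))
```

Because of this symmetry only the coefficients on and above the diagonal need actually be computed.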

Let the student form normal equations from the observation 
equations contained in § 1, assuming that those equations have 
equal weights. 

§ 8. Non-Linear Observation Equations. In all of the pre- 
ceding investigation, it has been tacitly assumed that the 
relation of the observed to the unknown quantities can be 
expressed by an equation of the first degree; but cases in 
which this relation is of a much more complicated character 
are not uncommon, and a method of applying the principle of 
least squares to these cases is required. For the sake of sim- 
plicity, this method will be derived for the case of two un- 
known quantities, but the process is perfectly general and can 
readily be extended to any other number of unknowns. 

Let x and y be any two quantities which have not been
directly measured but which are connected with an observed
quantity, m, by the relation

$$f(x, y) = m$$

which represents any equation whatever existing between x, y,
and m. Let $x_0$ and $y_0$ denote approximate values of x and y,
such that

$$x = x_0 + \Delta x, \qquad y = y_0 + \Delta y,$$

$\Delta x$ and $\Delta y$ being the corrections which must be added to $x_0$ and
$y_0$ in order to obtain the most probable values of x and y. We
may, for the present, suppose that $x_0$ and $y_0$ are mere guesses
at the values of x and y, and we may test their correctness by
substituting their numerical values in the equation

$$f(x, y) = m$$

which corresponds to each observed value of m. If every such
equation were exactly satisfied by these values, we should infer
that $x_0$ and $y_0$ were the most probable values of x and y. It
cannot be expected that this perfect agreement will ever be
found in practice, but from each observation equation a residual,
v, will be found, due partly to the errors of the observations
and partly to $\Delta x$ and $\Delta y$.

If in the equation f(x, y) = m we substitute for x and y
$x_0 + \Delta x$, $y_0 + \Delta y$, and develop the expression by Taylor's
Formula, remembering that $m - f(x_0, y_0)$ is the residual n
found by substituting numerical values of $x_0$, $y_0$ in the several
observation equations, we have

$$m - f(x_0, y_0) = \frac{dm}{dx_0}\,\Delta x + \frac{dm}{dy_0}\,\Delta y + \cdots$$

If numerical values of $\frac{dm}{dx_0}$, $\frac{dm}{dy_0}$ be introduced into this
equation, it becomes

$$a\,\Delta x + b\,\Delta y - n = 0.$$

Each observation equation may thus be made to furnish a lin- 
ear equation involving A x and A y, and these equations may 
be treated by the method of § 7. It must, however, be remem- 
bered that in the above development by Taylor's Formula we 
have retained only the first three terms of an infinite series, 
and if the approximate values $x_0$, $y_0$ are not so nearly the most
probable values that the squares and higher powers of $\Delta x$ and
$\Delta y$ are inappreciable, the development and the solution based
upon it are inaccurate. On this account, it is seldom advanta- 
geous to make a least square solution for the unknown quanti- 
ties until very approximate values of them have been found. 
These values will usually be obtained from the solution of a 
small number of the observation equations. 

The transformation of the observation equations by the 
introduction of corrections to assumed values of the unknowns 
is often advantageous even when the original equations are of 
the first degree, especially if the original quantities were of 
very different magnitudes. Thus, in the problem of § 1, the 
observation equations are of the form

$$l_0 + ct = m$$

in which c is a very small quantity while $l_0$ is approximately
1000. If we put

$$l_0 = 1000 + \Delta l_0, \qquad c = 0 + \Delta c,$$

we have

$$\frac{dm}{dl_0} = 1, \qquad \frac{dm}{dc} = t, \qquad n = m - 1000,$$

and the equations are transformed into

$$\begin{aligned}
\Delta l_0 + 20\,\Delta c - 0.22 &= 0 \\
\Delta l_0 + 40\,\Delta c - 0.65 &= 0 \\
\Delta l_0 + 50\,\Delta c - 0.90 &= 0 \\
\Delta l_0 + 60\,\Delta c - 1.05 &= 0
\end{aligned}$$

By this transformation the numerical operations involved in 
forming and solving the normal equations are much simplified 
through the substitution of small numbers in the place of 
large ones. 
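As an illustration of the convenience of the transformed equations, the sketch below (Python) forms and solves their normal equations, assuming equal weights. The variable names are those of this example only, and the two normal equations are solved by elementary algebra rather than by any method peculiar to the text.

```python
# Transformed equations of the text: dl + t*dc - n = 0, equal weights.
t = [20.0, 40.0, 50.0, 60.0]
n = [0.22, 0.65, 0.90, 1.05]

# Normal equations with a = 1 and b = t:
#   [aa] dl + [ab] dc = [an]
#   [ab] dl + [bb] dc = [bn]
aa = float(len(t))
ab = sum(t)
bb = sum(ti * ti for ti in t)
an = sum(n)
bn = sum(ti * ni for ti, ni in zip(t, n))

det = aa * bb - ab * ab
dl = (an * bb - ab * bn) / det      # correction to the assumed 1000
dc = (aa * bn - ab * an) / det      # correction to the assumed 0

l0 = 1000.0 + dl
c = 0.0 + dc
print(dl, dc, l0, c)
```

With these data the corrections come out as -0.196 and +0.0212, so that all of the arithmetic is carried on with numbers of two or three figures.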

§ 9. Formation and Solution of the Normal Equations. If
the number of unknown quantities is greater than two, and
especially if the number of observations is large, the numerical 
computation of a set of normal equations is a laborious process, 
and one in which errors are almost certain to occur unless 
special precautions are taken to guard against them. The 
method of forming these equations presented in this section 
has been developed with special reference to facilitating the 
numerical operations and obtaining the normals with the least 
expenditure of labor consistent with the requisite accuracy, 
and although some of the processes may seem at first sight 
unnecessary and cumbrous, a little experience in their use, or 
in their neglect, will convince the student that they are in 
the long run labor-saving devices. 

Let each observation equation be written out and arranged 
in tabular form, as in the following example. In order that 
these equations should furnish a good determination of the 
unknowns, x, y, z, it is necessary that the coefficients of these 
quantities should present a considerable range in their values
in the several equations. Thus, if all the coefficients of x
were alike, all the coefficients of y equal each to each, etc., the
equations would be absolutely indeterminate since we should
have several unknown quantities involved in a single equation 
many times repeated, and if the coefficients approximate to 
this equality the equations will be approximately indeter- 
minate, and will furnish unreliable values of the unknowns. 
If, therefore, several observations have been made under similar 
conditions, and furnish equations which are nearly identical, 
these will be nearly equivalent to a repetition of the same 
equation and it will be permissible to take their mean, having 
regard to their respective weights, and treat it as a single 
observation equation with a weight equal to the sum of the 
weights of the observations. 

Having thus reduced the number of equations as far as 
possible, each equation should be multiplied by the square root 
of its weight as was done, § 7, in obtaining the form of the 
normal equations. By this multiplication the weights will be 
completely taken into account and will require no further 
attention. Let the weighted equations thus obtained be represented
by

$$\begin{aligned}
a_1x + b_1y + c_1z + n_1 &= 0 \\
a_2x + b_2y + c_2z + n_2 &= 0 \\
a_3x + b_3y + c_3z + n_3 &= 0 \\
\text{etc.} &
\end{aligned}$$

It will usually facilitate the formation of the normals to so 
transform these equations that no number greater than 1 shall 
occur in any of them. This can always be done by introducing 
new unknown quantities and dividing each equation by some 
constant number, usually some power of 10. Thus in the case 
of the two equations 

$$\begin{aligned}
5x + 71y - 63 &= 0 \\
0.9x - 193y + 93 &= 0
\end{aligned}$$

let each equation be divided by 100 and put

$$\tfrac{1}{20}x = u, \qquad 1.93\,y = w;$$

the equations are thus transformed into

$$\begin{aligned}
1.000\,u + 0.368\,w - 0.630 &= 0 \\
0.180\,u - 1.000\,w + 0.930 &= 0
\end{aligned}$$

The solution of these equations will furnish values of u and w
from which x and y may be found by the relations

$$x = 20\,u, \qquad y = \frac{w}{1.93}.$$

The purpose of this transformation is to simplify the subse- 
quent numerical work by reducing the numbers involved to 
an approximate equality. 
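That the transformation alters only the scale of the unknowns, and not their values, may be checked as follows. The sketch is in Python; the helper `solve2` is a hypothetical name introduced here, and the pair of equations is solved by Cramer's rule merely for the sake of the comparison.

```python
def solve2(a1, b1, n1, a2, b2, n2):
    """Solve a1*x + b1*y + n1 = 0 and a2*x + b2*y + n2 = 0 by Cramer's rule."""
    det = a1 * b2 - a2 * b1
    x = (-n1 * b2 + n2 * b1) / det
    y = (-a1 * n2 + a2 * n1) / det
    return x, y

# Original equations: 5x + 71y - 63 = 0 and 0.9x - 193y + 93 = 0.
orig = [(5.0, 71.0, -63.0), (0.9, -193.0, 93.0)]
kx, ky = 20.0, 1.0 / 1.93          # x = kx*u, y = ky*w

# Divide each equation by 100 and substitute x = 20u, y = w/1.93.
scaled = [(a * kx / 100, b * ky / 100, n / 100) for a, b, n in orig]

x, y = solve2(*orig[0], *orig[1])
u, w = solve2(*scaled[0], *scaled[1])

# The two solutions agree once u and w are converted back.
assert abs(x - kx * u) < 1e-9 and abs(y - ky * w) < 1e-9
print([tuple(round(v, 3) for v in row) for row in scaled])
```

Rounded to three decimals the scaled coefficients are those of the transformed equations in the text.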

Every coefficient which appears in the normal equations is 
the sum of a series of products of two quantities, thus 

$$\begin{aligned}
{}[aa] &= a_1a_1 + a_2a_2 + a_3a_3 + \cdots \\
{}[ab] &= a_1b_1 + a_2b_2 + a_3b_3 + \cdots \\
{}[bn] &= b_1n_1 + b_2n_2 + b_3n_3 + \cdots
\end{aligned}$$

These products may be formed by the aid of Crelle's multiplication
tables* supplemented by a table of squares of numbers
for the [aa], [bb], etc. In case Crelle's tables are not
available, the products may be formed by logarithms or much
more rapidly by the following method due to Bessel. Form for
each equation the sums a + b, a + c, b + c, etc., for every pair
of numbers contained in the equation; then since

$$ab = \tfrac{1}{2}\{(a+b)^2 - aa - bb\}$$

we have

$$\begin{aligned}
{}[ab] &= \tfrac{1}{2}\{[(a+b)^2] - [aa] - [bb]\} \\
{}[bc] &= \tfrac{1}{2}\{[(b+c)^2] - [bb] - [cc]\} \\
&\text{etc.}
\end{aligned} \tag{8}$$

The [aa], [bb], [cc] are coefficients in the normal equations
and must be computed in any case, and the formation of [ab],
[bc], etc., therefore requires for each coefficient only the single
additional quantity $[(a+b)^2]$, $[(b+c)^2]$, and presents the
very great advantage that these quantities can be obtained

* Crelle, Rechentafeln, Berlin. These tables give the products of 
all numbers up to 1000 x 1000, and are of very general utility.


from a table of squares, and being all positive numbers no 
attention need be paid to the signs after the sums a + b, b + c,
etc., have been formed. 
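Bessel's device may be tried upon a small example. In the sketch below (Python) the two columns of coefficients are invented, and the function name is an assumption of this illustration; the product sum obtained from squares alone is compared with the same sum formed by direct multiplication.

```python
# Invented columns of coefficients; illustrative only.
a = [0.5, -0.2, 0.9]
b = [-0.4, 0.7, 0.3]

def bracket(u, v):
    """Return [uv], the sum of the products u_i * v_i."""
    return sum(x * y for x, y in zip(u, v))

aa = bracket(a, a)                                  # [aa], from a table of squares
bb = bracket(b, b)                                  # [bb]
sum_sq = sum((x + y) ** 2 for x, y in zip(a, b))    # [(a+b)^2]

ab_bessel = 0.5 * (sum_sq - aa - bb)                # relation (8)
ab_direct = bracket(a, b)                           # direct multiplication

assert abs(ab_bessel - ab_direct) < 1e-12
print(ab_bessel)
```

Only squares of numbers enter the first computation, so that the signs of the sums a + b are of no consequence once those sums have been formed.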

No method of computation can furnish a guaranty against 
the commission of numerical errors, and it is therefore desir- 
able to test the computation from time to time to ascertain if 
such errors have occurred. To secure such a test or "check,"
as it is called, we introduce the following auxiliary quantities,
one for each observation equation:

$$\begin{aligned}
s_1 &= a_1 + b_1 + c_1 + \cdots + n_1 \\
s_2 &= a_2 + b_2 + c_2 + \cdots + n_2 \\
&\text{etc.}
\end{aligned} \tag{9}$$

and form the quantities [as], [bs], ... [sn]. It will appear from
the mode in which the coefficients of the normal equations
are formed that

$$\begin{aligned}
{}[aa] + [ab] + [ac] + \cdots + [an] &= [as] \\
{}[ab] + [bb] + [bc] + \cdots + [bn] &= [bs] \\
\text{etc.} &
\end{aligned} \qquad \text{Check} \tag{9a}$$

The [as], [bs], etc., are formed in precisely the same manner
as [ab], [ac], etc., and the check relations above given must
be satisfied by the computed values of these quantities.
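The check relations can be exhibited on a small invented set of equations. The following Python sketch uses hypothetical coefficients and names throughout; it appends s to each equation and compares the two members of each check relation.

```python
# Invented weighted observation equations a*x + b*y + c*z + n = 0.
rows = [
    (0.8, -1.0, 0.0, 0.5),
    (0.9, -0.5, 0.7, 0.3),
    (1.0, -0.3, 1.0, 0.4),
    (0.9, -0.2, 0.8, -0.1),
]

def bracket(i, j, table):
    """Sum of products of columns i and j, e.g. [ab] for i = 0, j = 1."""
    return sum(r[i] * r[j] for r in table)

# Append the check quantity s = a + b + c + n to each equation.
with_s = [r + (sum(r),) for r in rows]
A, B, C, N, S = range(5)

lhs_a = sum(bracket(A, j, with_s) for j in (A, B, C, N))   # [aa]+[ab]+[ac]+[an]
lhs_b = sum(bracket(B, j, with_s) for j in (A, B, C, N))   # [ab]+[bb]+[bc]+[bn]

assert abs(lhs_a - bracket(A, S, with_s)) < 1e-12          # equals [as]
assert abs(lhs_b - bracket(B, S, with_s)) < 1e-12          # equals [bs]
```

An error committed in any one of the products on the left would in general destroy the agreement, and it is in this that the value of the check consists.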

Where only two unknown quantities are involved in the 
normal equations the solution of the equations may be con- 
veniently made by any of the methods of elementary algebra, 
but if the number of unknowns is greater than two, the simple 
and elegant method of successive substitutions proposed by 
Gauss may be employed with advantage. 

The normal equations in the case of three unknown quanti- 
ties are : 

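$$\begin{aligned}
{}[aa]x + [ab]y + [ac]z + [an] &= 0 \\
{}[ab]x + [bb]y + [bc]z + [bn] &= 0 \\
{}[ac]x + [bc]y + [cc]z + [cn] &= 0
\end{aligned}$$

and from the first of these

$$x = -\frac{[ab]}{[aa]}\,y - \frac{[ac]}{[aa]}\,z - \frac{[an]}{[aa]}. \tag{10}$$

This value of x substituted in the second and third equations
transforms them into

$$\begin{aligned}
{}[bb\cdot1]\,y + [bc\cdot1]\,z + [bn\cdot1] &= 0 \\
{}[bc\cdot1]\,y + [cc\cdot1]\,z + [cn\cdot1] &= 0
\end{aligned} \tag{11}$$

in which

$$\begin{aligned}
{}[bb\cdot1] &= [bb] - \frac{[ab]}{[aa]}[ab] &\qquad [cc\cdot1] &= [cc] - \frac{[ac]}{[aa]}[ac] \\
{}[bc\cdot1] &= [bc] - \frac{[ab]}{[aa]}[ac] &\qquad [bc\cdot1] &= [bc] - \frac{[ac]}{[aa]}[ab] \\
{}[bn\cdot1] &= [bn] - \frac{[ab]}{[aa]}[an] &\qquad [cn\cdot1] &= [cn] - \frac{[ac]}{[aa]}[an]
\end{aligned} \tag{12}$$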



These equations constitute a new set of normals, from which 
one unknown quantity has been eliminated. The correctness 
of the numerical work of this elimination may be tested by a 
continuation of the checks used in forming the original nor- 
mals. We introduce an auxiliary quantity 

$$[bs\cdot1] = [bb\cdot1] + [bc\cdot1] + [bn\cdot1]$$

and inquire its relation to [as], [bs], etc. If we substitute in
the expression for [bs · 1] the values of [bb · 1], [bc · 1], [bn · 1]
in terms of the original coefficients, having regard to the
relations

$$\begin{aligned}
{}[aa] + [ab] + [ac] + [an] &= [as] \\
{}[ab] + [bb] + [bc] + [bn] &= [bs]
\end{aligned} \tag{13}$$

we find

$$[bs\cdot1] = [bs] - [ab] - \frac{[ab]}{[aa]}\{[as] - [aa]\}$$

whence

$$[bs\cdot1] = [bs] - \frac{[ab]}{[aa]}[as]$$

and similarly

$$[cs\cdot1] = [cs] - \frac{[ac]}{[aa]}[as]. \tag{14}$$

We may therefore obtain a complete check upon the accuracy
of the numerical work involved in the elimination of x, by
forming the quantities [bs · 1], [cs · 1], in the same manner as
[bb · 1], [bc · 1], [bn · 1], etc., and comparing the actual sums
of these latter quantities with the computed check quantities.
By a repetition of the process of elimination we obtain

$$[cc\cdot2]\,z + [cn\cdot2] = 0 \qquad \text{Check } [cs\cdot2]$$

where

$$\begin{aligned}
{}[cc\cdot2] &= [cc\cdot1] - \frac{[bc\cdot1]}{[bb\cdot1]}[bc\cdot1] \\
{}[cn\cdot2] &= [cn\cdot1] - \frac{[bc\cdot1]}{[bb\cdot1]}[bn\cdot1] \\
{}[cs\cdot2] &= [cs\cdot1] - \frac{[bc\cdot1]}{[bb\cdot1]}[bs\cdot1] = [cc\cdot2] + [cn\cdot2]
\end{aligned} \tag{15}$$

and we are enabled to write the following equivalents for the 
original normal equations. 



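Elimination Equations.

$$\begin{aligned}
x + \frac{[ab]}{[aa]}\,y + \frac{[ac]}{[aa]}\,z + \frac{[an]}{[aa]} &= 0 \\
y + \frac{[bc\cdot1]}{[bb\cdot1]}\,z + \frac{[bn\cdot1]}{[bb\cdot1]} &= 0 \\
z + \frac{[cn\cdot2]}{[cc\cdot2]} &= 0
\end{aligned} \tag{16}$$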

The last of these equations gives the value of z directly, the 
second furnishes y as soon as z is known, and the first gives 
the value of x. The whole solution is therefore reduced to 
finding the values of the coefficients and absolute terms in 
these elimination equations. A convenient arrangement of the
computation by which these quantities are obtained is given 
in the following example, in which the actual computation is 
exhibited upon one page, and the opposite page contains a 
schedule correspondingly arranged showing the analytical 
equivalent of each number contained in the computation.

In making the multiplications of [ab], [ac], [an], [as], by
the constant factor [ab]/[aa], the logarithm of this factor is written
on the edge of a slip of paper, and being held successively
adjacent to the logarithms of [ab], [ac], [an], [as], the sum of
the two logarithms is taken mentally, the corresponding number
looked out from a logarithmic table and written in its
proper place under [bb], [bc], [bn], [bs]; a subtraction then
gives the value of [bb · 1], [bc · 1], [bn · 1], [bs · 1], and a similar
process is followed for each other derived coefficient.
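The whole process of elimination, with its checks, may be sketched as follows for three unknowns. The Python function below is an illustration written for this purpose under the sign convention of this section, [aa]x + [ab]y + [ac]z + [an] = 0; its name is hypothetical, and the normal equations solved are invented, having been constructed so that x = 1, y = -2, z = 3.

```python
def gauss_solve(aa, ab, ac, an, bb, bc, bn, cc, cn):
    """Solve three normal equations by successive substitution, with checks."""
    # Check sums, relations (9a).
    as_ = aa + ab + ac + an
    bs = ab + bb + bc + bn
    cs = ac + bc + cc + cn
    # First elimination, relations (12) and (14).
    f, g = ab / aa, ac / aa
    bb1, bc1, bn1, bs1 = bb - f * ab, bc - f * ac, bn - f * an, bs - f * as_
    cc1, cn1, cs1 = cc - g * ac, cn - g * an, cs - g * as_
    assert abs(bs1 - (bb1 + bc1 + bn1)) < 1e-9
    assert abs(cs1 - (bc1 + cc1 + cn1)) < 1e-9
    # Second elimination, relations (15).
    h = bc1 / bb1
    cc2, cn2, cs2 = cc1 - h * bc1, cn1 - h * bn1, cs1 - h * bs1
    assert abs(cs2 - (cc2 + cn2)) < 1e-9
    # Elimination equations (16), solved from the last upward.
    z = -cn2 / cc2
    y = -(bc1 * z + bn1) / bb1
    x = -(ab * y + ac * z + an) / aa
    return x, y, z

# Invented normal equations whose solution is x = 1, y = -2, z = 3.
x, y, z = gauss_solve(4.0, 1.0, 2.0, -8.0, 3.0, 0.0, 5.0, 5.0, -17.0)
print(x, y, z)
```

The assertions inside the function play the part of the check column: a slip in any derived coefficient causes the corresponding comparison to fail at once.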

§ 10. Example. To illustrate the principles contained in 
the preceding sections, and to exhibit in detail the process of 
deriving the most probable values of several unknown quanti- 
ties which are connected with the observed quantity by a 
rather complicated relation, we select from Vol. iii. Part 1 of 
the Memoirs of the National Academy of Sciences, page 58, the 
following series of experiments made with a 10-gauge Colt 
gun, loaded with uniform charges of four drams of powder and 
1^ ounces of shot, the shot ranging in fineness from No. 10 
up to No. 1 Buck. The purpose of the experiments was to 
determine the relation existing between the size (fineness) 
of the shot and its average velocity over a range of 30 yards. 
The following table contains the results of the experiments, 
each velocity being the mean result of from three to six dis- 
charges of the gun. The weight of a pellet of No. 10 shot is 
taken as the unit of weight, and the velocities are expressed
in feet per second. 

Size.            Weight.     Observed Velocity.

No. 10              1              848
No. 8               2              920
No. 6               4              966
No. 3               8              989
BB.                16             1000
FF.                32             1017
No. 1 Buck         64             1067

By plotting these results in a curve with the weight of the
shot as abscissas, and the observed velocities as ordinates, the
experimenter reached the conclusion that the relation between
the weight W and the velocity V is expressed by an equation
of the form

$$\sec\frac{V}{l} = \frac{W^n}{m}$$

in which l, m, n are constants whose values are to be determined
from the observations. It will be found upon trial that
$l_0 = 700$, $m_0 = 0.28$, $n_0 = 0.42$, in connection with the observed
values of V and W, will approximately satisfy this equation,
and we therefore adopt these approximate values and proceed
(§ 8) to determine the corrections $\Delta l$, $\Delta m$, $\Delta n$, which when
added to $l_0$, $m_0$, $n_0$ will furnish the most probable values of
l, m, n.

The several differential coefficients of the observation equation,
V = f(l, m, n, W), are

$$\frac{dV}{dl_0} = \frac{V}{l_0}, \qquad
\frac{dV}{dm_0} = -\frac{l_0}{m_0}\cot\frac{V}{l_0}, \qquad
\frac{dV}{dn_0} = \frac{l_0}{M}\,\log W\,\cot\frac{V}{l_0},$$

in which M denotes the modulus of the common system of
logarithms, M = 0.43429. In the factor $\cot\frac{V}{l_0}$, $\frac{V}{l_0}$ is the ratio of
two numbers, and must be construed as representing a certain
arc expressed in parts of the radius: the corresponding arc
expressed in degrees is $57^\circ.2958\,\frac{V}{l_0}$.

The form of the observation equation with which we are
here concerned is

$$\frac{V}{l_0}\,\Delta l - \frac{l_0}{m_0}\cot\frac{V}{l_0}\,\Delta m + \frac{l_0}{M}\,\log W\,\cot\frac{V}{l_0}\,\Delta n - \left(V - l_0\sec^{-1}\frac{W^{n_0}}{m_0}\right) = 0$$

and introducing into this equation the numerical values of
$l_0$, $m_0$, $n_0$, V, W, M, we find the following

Observation Equations.

(1)   1.29 Δl  - 729 Δm  +   0 Δn  + 53 = 0      p = 1.0
(2)   1.36     - 535     + 104     + 32 = 0          1.0
(3)   1.41     - 396     + 154     + 24 = 0          0.8
(4)   1.45     - 294     + 172     + 28 = 0          1.0
(5)   1.48     - 219     + 170     + 38 = 0          1.2
(6)   1.50     - 164     + 159     + 37 = 0          1.1
(7)   1.52     - 122     + 142     -  2 = 0          1.0
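The manner in which these coefficients arise may be followed in the sketch below (Python). It assumes, as the differential coefficients imply, that the relation is sec(V/l) = W^n/m, so that the computed velocity is l_0 times the arc whose cosine is m_0/W^{n_0}; the function name is an invention of this example. The absolute term is the computed less the observed velocity.

```python
import math

l0, m0, n0 = 700.0, 0.28, 0.42      # assumed approximate values of l, m, n
M = 0.43429                          # modulus of common logarithms
W = [1, 2, 4, 8, 16, 32, 64]
V_obs = [848, 920, 966, 989, 1000, 1017, 1067]

def coefficients(w, v_obs):
    """Coefficients of dl, dm, dn and the absolute term for one observation."""
    theta = math.acos(m0 / w ** n0)  # the arc V/l0, since sec(theta) = W^n / m
    v = l0 * theta                   # computed velocity
    cot = 1.0 / math.tan(theta)
    a = v / l0
    b = -(l0 / m0) * cot
    c = (l0 / M) * math.log10(w) * cot
    n = v - v_obs                    # computed minus observed
    return a, b, c, n

table = [coefficients(w, v) for w, v in zip(W, V_obs)]
for i, row in enumerate(table, start=1):
    print(i, " ".join(f"{x:+9.2f}" for x in row))
```

Rounded as in the text, the first row gives 1.29, -729, 0 and +53, and the remaining rows should agree with the printed equations to about the same degree.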

The absolute terms of these equations are residuals obtained
by substituting in the original equation

$$V - l\sec^{-1}\frac{W^n}{m} = 0$$

the assumed values of $l_0$, $m_0$, and $n_0$, and the smallness of these
residuals compared with the values of V shows that the
assumed quantities are approximately correct values of l, m, n.
This computed value of V was used instead of the observed V 
in deriving the coefficients of the equations. The memoir from 
which our data are taken contains no indication of the weights 
to be assigned to the several determinations of V, and in the 
absence of such information they should all be treated as 
equally precise and given the weight 1 ; but for the sake of 
illustration a slightly different set of weights indicated above 
by p has been assigned to them, and by multiplying each equa- 
tion by the square root of its weight we obtain the following 

Weighted Observation Equations.

1.29 Δl  - 729 Δm  +   0 Δn  + 53 = 0
1.36     - 535     + 104     + 32 = 0
1.26     - 352     + 137     + 21 = 0
1.45     - 294     + 172     + 28 = 0
1.62     - 239     + 185     + 42 = 0
1.58     - 172     + 167     + 38 = 0
1.52     - 122     + 142     -  2 = 0

The coefficients and absolute terms in these equations are of
very different magnitudes, and to simplify the subsequent
numerical work we divide each equation through by 100 and put

$$\Delta l = \frac{x}{0.0162}, \qquad \Delta m = \frac{y}{7.29}, \qquad \Delta n = \frac{z}{1.85}$$

and introduce x, y, z into the equations in place of $\Delta l$, $\Delta m$, $\Delta n$.
This step, which is frequently called rendering the equations
homogeneous, furnishes the following



Homogeneous Weighted Observation Equations.

0.796 x  - 1.000 y  + 0.000 z  + 0.530 = 0      s = + 0.326
0.839    - 0.733    + 0.563    + 0.320 = 0            0.989
0.777    - 0.482    + 0.741    + 0.210 = 0            1.246
0.895    - 0.403    + 0.931    + 0.280 = 0            1.703
1.000    - 0.327    + 1.000    + 0.420 = 0            2.093
0.975    - 0.236    + 0.903    + 0.380 = 0            2.022
0.938    - 0.167    + 0.768    - 0.020 = 0            1.519

The values of s = a + b + c + n, which are to be used as a
check in the formation of the normal equations, are derived 
from these equations. 
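The passage from the weighted to the homogeneous equations, and the formation of s, may be repeated by the following sketch (Python). It starts from the weighted observation equations as printed and applies the division by 100 together with the substitutions Δl = x/0.0162, Δm = y/7.29, Δn = z/1.85; the variable names are those of this illustration.

```python
# Weighted observation equations as printed: (a, b, c, n).
weighted = [
    (1.29, -729, 0, 53),
    (1.36, -535, 104, 32),
    (1.26, -352, 137, 21),
    (1.45, -294, 172, 28),
    (1.62, -239, 185, 42),
    (1.58, -172, 167, 38),
    (1.52, -122, 142, -2),
]
# Divisors of x, y, z in the substitutions for dl, dm, dn.
scale = (0.0162, 7.29, 1.85)

homogeneous = []
for a, b, c, n in weighted:
    row = (a / 100 / scale[0], b / 100 / scale[1], c / 100 / scale[2], n / 100)
    homogeneous.append(row + (sum(row),))    # last entry is s = a + b + c + n

for row in homogeneous:
    print(" ".join(f"{v:+.3f}" for v in row))
```

The numbers so found should agree with the printed table to within a unit or two of the third decimal place, the small differences arising from the rounding of the printed coefficients.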

The formation of the coefficients of the normal equations by
the use of a table of squares, Bessel's method, is represented in
the following tables : 



Sums of the Coefficients.

Equation    a+b     a+c     a+n     a+s     b+c     b+n     b+s     c+n     c+s

   1       0.204   0.796   1.326   1.122   1.000   0.470   0.674   0.530   0.326
   2       0.106   1.402   1.159   1.828   0.170   0.413   0.256   0.883   1.552
   3       0.295   1.518   0.987   2.023   0.259   0.272   0.764   0.951   1.987
   4       0.492   1.826   1.175   2.598   0.528   0.123   1.300   1.211   2.634
   5       0.673   2.000   1.420   3.093   0.673   0.093   1.766   1.420   3.093
   6       0.739   1.878   1.355   2.997   0.667   0.144   1.786   1.283   2.925
   7       0.771   1.706   0.918   2.457   0.601   0.187   1.352   0.748   2.287



[Table: squares of the coefficients and of their sums, with the products [ab], [ac], etc., and the check quantities computed at the foot of the columns.]

From the sums of the squares contained in the several columns
of this table the coefficients [ab], [ac], etc., are computed
at the foot of the columns by the relations

$$[ab] = \tfrac{1}{2}\left\{[(a+b)^2] - \left([a^2] + [b^2]\right)\right\}, \text{ etc.}$$

The check quantity [as] is compared with

$$[aa] + [ab] + [ac] + [an]$$

whose value is written immediately under [as], and which
must agree with [as] within two or three units of the last
decimal place. Every coefficient of the normal equations
enters into one or more of these sums, which therefore furnish
a complete test of the accuracy of the work in passing from the
homogeneous observation equations to the normal equations.
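
The two relations just stated may be tried upon a small set of numbers. The following is a minimal sketch in Python; the four observation equations are hypothetical values chosen only for illustration (they are not the data of the text), and the helper name `bracket` is likewise an assumption of the sketch.

```python
# Hypothetical coefficients (a, b, c, n) of four homogeneous observation
# equations a*x + b*y + c*z + n = 0. Illustrative numbers only.
rows = [
    (1.0, 0.5, -0.2, 0.3),
    (0.8, -0.4, 0.6, -0.1),
    (-0.3, 0.9, 0.4, 0.2),
    (0.5, 0.2, -0.7, -0.4),
]

def bracket(u, v):
    """Gauss bracket [uv]: the sum of the products u_i * v_i."""
    return sum(ui * vi for ui, vi in zip(u, v))

a, b, c, n = (list(col) for col in zip(*rows))
s = [sum(r) for r in rows]               # check column s = a + b + c + n

# [ab] obtained from sums of squares only, as in the relation above.
a_plus_b = [ai + bi for ai, bi in zip(a, b)]
ab_from_squares = 0.5 * (bracket(a_plus_b, a_plus_b)
                         - (bracket(a, a) + bracket(b, b)))
ab_direct = bracket(a, b)

# Check quantity: [as] must equal [aa] + [ab] + [ac] + [an].
as_direct = bracket(a, s)
as_from_parts = bracket(a, a) + bracket(a, b) + bracket(a, c) + bracket(a, n)

print(ab_from_squares, ab_direct)
print(as_direct, as_from_parts)
```

In exact arithmetic the two members of each pair are identical; in a hand computation they agree only to within the accumulated rounding, which is the tolerance of "two or three units of the last decimal place" mentioned above.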
We now write the 

Normal Equations. 

+ 5.576 x - 2.861 y + 4.480 z + 1.875 = 0
- 2.861 x + 2.122 y - 1.813 z - 1.200 = 0
+ 4.480 x - 1.813 y + 4.138 z + 1.348 = 0

It may be seen from an inspection of these equations that 
the data upon which they are based will not furnish a good 
determination of the values of all the unknowns, for if the 
first equation be divided by — 2 the quotient will be very like 
the second equation, and if it be multiplied by -h |^ the product 
will be very like the third equation. We proceed, however, 
with the solution by Gauss's method, which will furnish the
best results that the data can be made to yield.
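
The near proportionality of the equations remarked upon above can be expressed numerically. The following is a minimal sketch, not part of the original treatment, which computes the cosine of the angle between the rows of coefficients of the normal equations; the helper name `cosine` is an illustrative choice. A cosine numerically close to 1 indicates that two equations are nearly multiples of each other.

```python
import math

# Coefficients of x, y, z in the three normal equations of the text.
rows = [
    (5.576, -2.861, 4.480),
    (-2.861, 2.122, -1.813),
    (4.480, -1.813, 4.138),
]

def cosine(u, v):
    """Cosine of the angle between two rows of coefficients."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

cos_12 = cosine(rows[0], rows[1])   # first equation against the second
cos_13 = cosine(rows[0], rows[2])   # first equation against the third
print(round(cos_12, 3), round(cos_13, 3))
```

Both cosines come out close to unity in absolute value, which is another way of saying that the three equations are far from independent and that the unknowns will be poorly separated.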
Solution of the Normal Equations.

    [aa]   [ab]   [ac]   [an]   [as]
    log [aa]   log [ab]   log [ac]   log [an]   log [as]

    log ([ab]/[aa])
    [bb]   [bc]   [bn]   [bs]
    ([ab]/[aa])[ab]   ([ab]/[aa])[ac]   ([ab]/[aa])[an]   ([ab]/[aa])[as]
    [bb.1]   [bc.1]   [bn.1]   [bs.1]                  Check sum.
    log [bb.1]   log [bc.1]   log [bn.1]   log [bs.1]

    log ([ac]/[aa])      log ([bc.1]/[bb.1])
    [cc]   [cn]   [cs]
    ([ac]/[aa])[ac]   ([ac]/[aa])[an]   ([ac]/[aa])[as]
    [cc.1]   [cn.1]   [cs.1]                           Check sum.
    ([bc.1]/[bb.1])[bc.1]   ([bc.1]/[bb.1])[bn.1]   ([bc.1]/[bb.1])[bs.1]
    [cc.2]   [cn.2]   [cs.2]                           Check sum.
    log [cc.2]   log [cn.2]

Elimination Equations.

$$x + \frac{[ab]}{[aa]}\,y + \frac{[ac]}{[aa]}\,z + \frac{[an]}{[aa]} = 0$$

$$y + \frac{[bc.1]}{[bb.1]}\,z + \frac{[bn.1]}{[bb.1]} = 0$$

$$z + \frac{[cn.2]}{[cc.2]} = 0$$









The course of the computation after the formation of the elimina- 
tion equations is sufficiently indicated upon the opposite page. 
Solution of the Normal Equations.

    Row                          log of
                                 multiplier
    [aa] to [as]                              + 5.576   - 2.861   + 4.480   + 1.875   + 9.070
    logarithms                                 0.7464    0.4566n   0.6513    0.2730    0.9576

    [bb] to [bs]                                        + 2.122   - 1.813   - 1.200   - 3.752
    ([ab]/[aa]) x first row      9.7102n                + 1.468   - 2.299   - 0.962   - 4.654
    [bb.1] to [bs.1]                                    + 0.654   + 0.486   - 0.238   + 0.902
    Check sum                                                                         + 0.902
    logarithms                                           9.8156    9.6866    9.3766n   9.9552

    [cc] to [cs]                                                  + 4.138   + 1.348   + 8.153
    ([ac]/[aa]) x first row      9.9049                           + 3.599   + 1.507   + 7.286
    [cc.1] to [cs.1]                                              + 0.539   - 0.159   + 0.867
    Check sum                                                                         + 0.866
    ([bc.1]/[bb.1]) x [b.1] row  9.8710                           + 0.361   - 0.177   + 0.670
    [cc.2] to [cs.2]                                              + 0.178   + 0.018   + 0.197
    Check sum                                                                         + 0.196
    logarithms                                                     9.2504    8.2553
    log ([cn.2]/[cc.2])          9.0049

Elimination Equations.

    x - 0.513 y + 0.803 z + 0.336 = 0        x = - 0.030
                y + 0.743 z - 0.364 = 0      y = + 0.439
                          z + 0.101 = 0      z = - 0.101

    log x        8.4771n     log y       9.6425      log z       9.0049n
    log 0.0162   8.2095      log 7.29    0.8627      log 1.85    0.2672

    Δl         - 1.8         Δm        + 0.0602      Δn        - 0.0547
    l_0        700.0         m_0       + 0.2800      n_0       + 0.4200
    l          698.2         m         + 0.3402      n         + 0.3663
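
The elimination carried out above with logarithms may be repeated in floating-point arithmetic. The following is a minimal sketch, assuming only the three normal equations as printed; the function name `gauss_eliminate` is an illustrative choice, and no pivoting is attempted because the diagonal coefficients of normal equations are positive. The check column s is carried through each step exactly as in the printed scheme.

```python
def gauss_eliminate(equations):
    """Solve equations of the form a*x + b*y + ... + n = 0.

    Each row lists the coefficients followed by the absolute term n.
    A check column s (the sum of the row) is carried through every step.
    """
    rows = [list(r) + [sum(r)] for r in equations]
    k = len(rows)
    for i in range(k):
        for j in range(i + 1, k):
            factor = rows[j][i] / rows[i][i]
            rows[j] = [rj - factor * ri for rj, ri in zip(rows[j], rows[i])]
    # Each reduced check term must still equal the sum of its own row.
    check_error = max(abs(sum(r[:-1]) - r[-1]) for r in rows)
    # Back substitution, beginning with the last unknown.
    unknowns = [0.0] * k
    for i in reversed(range(k)):
        rest = sum(rows[i][j] * unknowns[j] for j in range(i + 1, k))
        unknowns[i] = -(rows[i][k] + rest) / rows[i][i]
    return unknowns, rows, check_error

normal = [
    [5.576, -2.861, 4.480, 1.875],
    [-2.861, 2.122, -1.813, -1.200],
    [4.480, -1.813, 4.138, 1.348],
]

(x, y, z), reduced, check_error = gauss_eliminate(normal)
print(round(x, 3), round(y, 3), round(z, 3))
print(round(reduced[1][1], 3), round(reduced[2][2], 3))   # [bb.1], [cc.2]
```

The unknowns found in this way differ from the printed values by one or two units of the third decimal place. The difference arises from the small divisor [cc.2], which magnifies the rounding of the three-place hand computation.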
If with the values of l, m, n thus obtained the corresponding
velocities be computed by means of the original equation

$$\frac{V}{l} = \sec^{-1}\frac{W^{n}}{m}$$

the resulting residuals should be smaller than those derived
from the substitution of $l_0$, $m_0$, $n_0$, i.e., the absolute terms of
the observation equations. The following comparison of these
residuals shows a much better representation of the observed
values of V, especially if the sums of the squares, [vv], be
compared.

                    Observed - Computed V.

    Weight of Shot           1     2     4     8    16    32    64

    f(l_0, m_0, n_0)       -53   -32   -24   -28   -38   -37    +2
    f(l, m, n)              -6   +10   +13    +4   -10   -13   +22

Not only are the residuals diminished in magnitude, but
their distribution is much more nearly in agreement with the
law of error.
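
The comparison of the sums of the squares recommended above is readily made. The following short sketch takes the two rows of residuals from the table; the variable names are illustrative only.

```python
# Residuals (observed minus computed V) for shot of weight 1, 2, ..., 64.
res_initial = [-53, -32, -24, -28, -38, -37, 2]     # from l_0, m_0, n_0
res_adjusted = [-6, 10, 13, 4, -10, -13, 22]        # from l, m, n

def sum_of_squares(residuals):
    """The quantity [vv] of the text."""
    return sum(v * v for v in residuals)

vv_initial = sum_of_squares(res_initial)
vv_adjusted = sum_of_squares(res_adjusted)
print(vv_initial, vv_adjusted)   # prints 8010 1074
```

The adjustment thus reduces [vv] to less than one-seventh of its former value.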

The values thus obtained for l, m, n ought not to be considered
the best attainable, since the corrections Δm, Δn are
relatively large fractions of $m_0$ and $n_0$, and it is probable that
the neglected terms containing Δm², Δn², etc., have an appreciable
influence upon the solution. To secure the utmost accuracy
these values of l, m, n should be treated as new approximations
and another set of corrections Δl, Δm, Δn derived. This
re-solution is recommended to the student as a valuable exercise.
Let the student also derive from the data of § 1 the most
probable values of $l_0$ and c, assigning unequal weights to the
several equations.

§ 11. Conditioned Observations. There is a class of cases in
which the application of the principle of least squares seems
to produce absurd results. Thus if each angle of a plane triangle
be measured many times in order to obtain an accurate
set of values for the angles, the application of the principle
that the [pvv] must be made a minimum will furnish as the
most probable value of each angle the weighted mean of the
measures of that angle, but the sum of these weighted means
will usually differ slightly from 180°, and since the sum of the
angles of every plane triangle must equal 180° it appears that
the most probable values above derived are impossible values.
It must, however, be noted that the method of treatment above
outlined is itself a violation of Principle A, § 1, since the knowledge
that the sum of the angles must equal 180° furnishes a
relation among those angles which may be used and ought to
be used in determining their most probable values; and the apparent
absurdity above found is produced by neglecting this
part of the data.

A relation such as the above which must be exactly satisfied 
by a set of observed quantities is called a rigorous condition, 
the equation by which the relation is expressed is called an 
equation of condition, and observations of such quantities are 
known as conditioned observations. The number of rigorous 
conditions is, of course, always less than the number of un- 
known quantities, since if it were equal to the number of such 
quantities the values of the latter would be determined by the 
conditions alone, independently of any observations. 

In order to develop a convenient method of treating rigorous
conditions, let x, y, z be three unknown quantities which are to
be determined from observation, but whose values are required
to satisfy the equations of condition

$$\phi(x, y, z) = 0 \qquad \psi(x, y, z) = 0$$

Let the measurements or observations for the determination of
the unknown quantities be represented by observation equations
of the form

$$f_1(x, m) = 0 \qquad f_2(y, n) = 0 \qquad f_3(z, q) = 0$$

m, n, and q being the quantities directly measured, and the
measures for the determination of x being quite independent
of those for y, z, etc. In accordance with the principles of
least squares the values of the unknown quantities are to be so
determined that [pvv] shall be made a minimum in each series
of observations above represented, and therefore the sum of all
the weighted squares of the residuals must also be a minimum.
Owing to the conditions $\phi(x, y, z) = 0$, $\psi(x, y, z) = 0$ it will not
in general be possible to assign to the unknown quantities
values which will give to [pvv] its least possible value, and the
problem becomes one of conditioned or relative minima, i.e.
out of all the sets of values of x, y, z which will exactly satisfy
the equations of condition it is required to find that set which
assigns to [pvv] its least value consistent with those equations.
The method of determining relative minima is as follows
(Jordan, Cours d'Analyse, Vol. i., § 205): Multiply each equation
of condition by an undetermined constant factor, and add
the products to the function which is to be made a minimum.
The derivative of the new function with respect to each unknown
quantity must be placed equal to 0, and the equations
thus formed, together with the equations of condition, will be
just sufficient to determine the unknown quantities and the constant
multipliers. Thus, in the present case, representing the
multipliers by $-2k_1$ and $-2k_2$, we have for the new function

$$w = [pvv] - 2k_1\,\phi(x, y, z) - 2k_2\,\psi(x, y, z) \tag{17}$$

and

$$\frac{dw}{dx} = 0 \qquad \frac{dw}{dy} = 0 \qquad \frac{dw}{dz} = 0 \tag{18}$$

$$\phi(x, y, z) = 0 \qquad \psi(x, y, z) = 0$$

will determine $k_1$, $k_2$, x, y, and z.

It was shown in § 7 that in general for three unknown
quantities,

$$\frac{d[pvv]}{dx} = 2[paa]\,x + 2[pab]\,y + 2[pac]\,z + 2[pan]$$

but in the case here considered those observation equations
which contain x do not contain either y or z, and, therefore, the
b and c coefficients in those equations are to be considered
zero, all the products ab, ac, are also zero, and

$$\frac{d[pvv]}{dx} = 2[paa]\,x + 2[pan]$$

with similar expressions for the y and z derivatives.
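Denoting for the sake of brevity $\phi(x, y, z)$ and $\psi(x, y, z)$
by $\phi$ and $\psi$ respectively, we obtain by differentiating w

$$\tfrac{1}{2}\frac{dw}{dx} = [paa]\,x + [pan] - k_1\frac{d\phi}{dx} - k_2\frac{d\psi}{dx} = 0$$

$$\tfrac{1}{2}\frac{dw}{dy} = [pbb]\,y + [pbn] - k_1\frac{d\phi}{dy} - k_2\frac{d\psi}{dy} = 0$$

$$\tfrac{1}{2}\frac{dw}{dz} = [pcc]\,z + [pcn] - k_1\frac{d\phi}{dz} - k_2\frac{d\psi}{dz} = 0$$

from which

$$x = -\frac{[pan]}{[paa]} + \frac{d\phi}{dx}\frac{k_1}{[paa]} + \frac{d\psi}{dx}\frac{k_2}{[paa]}$$

$$y = -\frac{[pbn]}{[pbb]} + \frac{d\phi}{dy}\frac{k_1}{[pbb]} + \frac{d\psi}{dy}\frac{k_2}{[pbb]}$$

$$z = -\frac{[pcn]}{[pcc]} + \frac{d\phi}{dz}\frac{k_1}{[pcc]} + \frac{d\psi}{dz}\frac{k_2}{[pcc]}$$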

These equations determine the values of x, y, z, when $k_1$ and
$k_2$ are known, and it should be observed that the first terms of
the second members of the equations are the values of x, y, z,
which would be obtained by treating the observations as if
these quantities were entirely independent of each other, e.g.
in the case of direct observations of the quantities they are
the weighted means of the observations. If we represent the
values thus obtained by $x_0$, $y_0$, $z_0$, and represent by $v_1$, $v_2$, $v_3$ the
corrections which must be added to these quantities in order
to obtain the most probable values of x, y, z, i.e. put

$$x = x_0 + v_1 \qquad y = y_0 + v_2 \qquad z = z_0 + v_3$$

we shall have

$$\left.\begin{aligned}
v_1 &= \frac{d\phi}{dx}\frac{k_1}{[paa]} + \frac{d\psi}{dx}\frac{k_2}{[paa]} \\
v_2 &= \frac{d\phi}{dy}\frac{k_1}{[pbb]} + \frac{d\psi}{dy}\frac{k_2}{[pbb]} \\
v_3 &= \frac{d\phi}{dz}\frac{k_1}{[pcc]} + \frac{d\psi}{dz}\frac{k_2}{[pcc]}
\end{aligned}\right\} \tag{19}$$
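The quantities $k_1$ and $k_2$ are called correlates, and from the
manner in which they were introduced it appears that the
number of correlates is equal to the number of rigorous conditions
to which the observed quantities are subject. To
determine the values of the correlates let $x_0 + v_1$, $y_0 + v_2$, $z_0 + v_3$
be substituted for x, y, z in the equations of condition, and
the equations be developed by Taylor's Formula, giving

$$\phi(x, y, z) = \phi(x_0, y_0, z_0) + \frac{d\phi}{dx}v_1 + \frac{d\phi}{dy}v_2 + \frac{d\phi}{dz}v_3 + \text{etc.} = 0$$

and a similar expression for $\psi(x, y, z)$. Let the values of
$v_1$, $v_2$, $v_3$ in terms of $k_1$ and $k_2$ be substituted in these equations,
and put

$$\left.\begin{aligned}
\frac{d\phi}{dx}\,[paa]^{-\frac{1}{2}} &= \alpha_1, & \frac{d\phi}{dy}\,[pbb]^{-\frac{1}{2}} &= \alpha_2, & \frac{d\phi}{dz}\,[pcc]^{-\frac{1}{2}} &= \alpha_3 \\
\frac{d\psi}{dx}\,[paa]^{-\frac{1}{2}} &= \beta_1, & \frac{d\psi}{dy}\,[pbb]^{-\frac{1}{2}} &= \beta_2, & \frac{d\psi}{dz}\,[pcc]^{-\frac{1}{2}} &= \beta_3
\end{aligned}\right\} \tag{20}$$

and the equations become

$$\left.\begin{aligned}
[\alpha\alpha]\,k_1 + [\alpha\beta]\,k_2 + \phi(x_0, y_0, z_0) &= 0 \\
[\alpha\beta]\,k_1 + [\beta\beta]\,k_2 + \psi(x_0, y_0, z_0) &= 0
\end{aligned}\right\} \tag{21}$$

from which the values of $k_1$ and $k_2$ may be obtained, and thus
the values of $v_1$, $v_2$, $v_3$ from equations (19), which may now be
put in the following more convenient form:

$$\sqrt{[paa]}\;v_1 = \alpha_1 k_1 + \beta_1 k_2$$

$$\sqrt{[pbb]}\;v_2 = \alpha_2 k_1 + \beta_2 k_2$$

$$\sqrt{[pcc]}\;v_3 = \alpha_3 k_1 + \beta_3 k_2$$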

The method by which the above equations have been derived 
for the case of three unknown quantities connected by two 
equations of condition is perfectly general and may be ex- 
tended to any other number of quantities whose values are 
to be obtained from independent observations. In the cases 
which actually arise in practice the observation equations and 
equations of condition are usually of simple form, the differen- 
tial coefficients and the quantities a, b, c, etc., being usually 
equal to either 1 or 0. 
Problem. Let the student show by the method of correlates
that if the sum of the measured angles of a plane triangle
exceed 180° by a quantity e, the angles must be corrected by
distributing e among them in such a manner that the correction
to each angle is inversely proportional to the weight of the
angle.

To illustrate the application of the principles of the present
section to a numerical problem, we select from the U. S. C. &
G. Survey Report for 1884, pages 409 et seq., the following telegraphic
determinations of longitude, and seek to adjust them
so that they shall be mutually consistent. Each difference of
longitude between two stations was directly observed, so that
the observation equations are all of the form $x = m_1$, $x = m_2$,
etc., and the values given below are the weighted means of the
individual observations of each series. The probable error of
each determination (see § 12) is placed immediately after the
quantity itself, and the weights of the determinations are
assumed to be inversely proportional to the squares of the
probable errors.







                                                  Observed
    Stations.                         Symbol.     Difference of             1/√p     1/p
                                                  Longitude.

    Cambridge, Mass., Washington, D.C.   x_0      23m 41s.041 ± 0s.018      0.18     0.032
    Cambridge, Mass., Cleveland, O.      y_0      42  14 .875 ± 0 .038      0.38      .144
    Cambridge, Mass., Columbus, O.       z_0      47  27 .713 ± 0 .035      0.35      .122
    Washington, D.C., Columbus, O.       u_0      23  46 .816 ± 0 .038      0.38      .144
    Cleveland, O., Columbus, O.          w_0       5  12 .929 ± 0 .045      0.45      .202


The five observed differences of longitude give rise to two
rigorous conditions represented by the following equations of
condition:

$$\phi:\; u + x - z = 0 \qquad \psi:\; w + y - z = 0$$

The coefficients in the observation equations being all equal to
unity,

$$[paa] = p_x, \quad [pbb] = p_y, \quad \text{etc.,}$$

$$\alpha_1 = \frac{d\phi}{dx}\frac{1}{\sqrt{p_x}}, \quad \alpha_2 = \frac{d\phi}{dy}\frac{1}{\sqrt{p_y}}, \quad \beta_1 = \frac{d\psi}{dx}\frac{1}{\sqrt{p_x}}, \quad \text{etc.,}$$

and from these expressions are derived the following values of
the coefficients, together with the sums $\sigma_1 = \alpha_1 + \beta_1$, $\sigma_2 = \alpha_2 + \beta_2$,
etc., which are to be employed as a check upon the formation
of the normal equations for determining the correlates.

                        Coefficients.

    Subscripts.       1.        2.        3.        4.        5.

        α          + 0.18      0.00    - 0.35    + 0.38      0.00
        β            0.00    + 0.38    - 0.35      0.00    + 0.45
        σ          + 0.18    + 0.38    - 0.70    + 0.38    + 0.45



            Formation of the Correlate Equations.

        αα           αβ           ασ           ββ           βσ

    + 0.0324     + 0.0000     + 0.0324     + 0.0000     + 0.0000
       .0000        .0000        .0000        .1444        .1444
       .1225        .1225        .2450        .1225        .2450
       .1444        .0000        .1444        .0000        .0000
       .0000        .0000        .0000        .2025        .2025

    + 0.2993     + 0.1225     + 0.4218     + 0.4694     + 0.5919
    Check.                      0.4218                    0.5919



Correlate Normal Equations.

    + 0.2993 k_1 + 0.1225 k_2 + 0s.144 = 0        k_1 = - 0s.449
    + 0.1225 k_1 + 0.4694 k_2 + 0 .091 = 0        k_2 = - 0 .078
The absolute terms of the correlate equations are obtained
by substituting the observed values $x_0$, $y_0$, $z_0$, $u_0$, $w_0$ in the equations
of condition, and the values of $k_1$, $k_2$ may be found from
the correlate equations, either by Gauss's method of substitution
or by any of the ordinary algebraic processes of elimination.
The corrections to $x_0$, $y_0$, $z_0$, etc., and the adopted values
of the unknown quantities, are now found from

    v_1 = + 0.032 k_1 + 0.000 k_2 = - 0s.014        x = 23m 41s.027
    v_2 = + 0.000 k_1 + 0.144 k_2 = - 0 .011        y = 42  14 .864
    v_3 = - 0.122 k_1 - 0.122 k_2 = + 0 .064        z = 47  27 .777
    v_4 = + 0.144 k_1 + 0.000 k_2 = - 0 .065        u = 23  46 .751
    v_5 = + 0.000 k_1 + 0.202 k_2 = - 0 .016        w =  5  12 .913

The values thus obtained satisfy the rigorous conditions of 
the problem, and are the most probable values which can be 
obtained from the data given above. 
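
The whole adjustment by correlates may be retraced in a few lines. The following is a minimal sketch under stated assumptions: the quantities are taken in the order x, y, z, u, w; the weights are scaled as in the table, so that 1/√p is ten times the probable error; and the two correlate equations are solved by determinants rather than by Gauss's substitution. All variable names are illustrative.

```python
# Observed differences of longitude x0, y0, z0, u0, w0 (minutes, seconds).
observed = [(23, 41.041), (42, 14.875), (47, 27.713), (23, 46.816), (5, 12.929)]
prob_err = [0.018, 0.038, 0.035, 0.038, 0.045]     # seconds of time

# Partial derivatives of the conditions with respect to x, y, z, u, w.
dphi = [1, 0, -1, 1, 0]     # phi: u + x - z = 0
dpsi = [0, 1, -1, 0, 1]     # psi: w + y - z = 0

secs = [60 * m + s for m, s in observed]
q = [10 * r for r in prob_err]          # 1/sqrt(p), scaled as in the table

alpha = [d * qi for d, qi in zip(dphi, q)]
beta = [d * qi for d, qi in zip(dpsi, q)]
aa = sum(a * a for a in alpha)
ab = sum(a * b for a, b in zip(alpha, beta))
bb = sum(b * b for b in beta)

# Absolute terms: the conditions evaluated with the observed values.
phi0 = sum(d * s for d, s in zip(dphi, secs))
psi0 = sum(d * s for d, s in zip(dpsi, secs))

# Solve  aa*k1 + ab*k2 + phi0 = 0  and  ab*k1 + bb*k2 + psi0 = 0.
det = aa * bb - ab * ab
k1 = (-phi0 * bb + psi0 * ab) / det
k2 = (-psi0 * aa + phi0 * ab) / det

# Corrections v_i = (alpha_i*k1 + beta_i*k2) / sqrt(p_i), and adjusted values.
v = [qi * (a * k1 + b * k2) for qi, a, b in zip(q, alpha, beta)]
adjusted = [s + vi for s, vi in zip(secs, v)]

for name, value in zip("xyzuw", adjusted):
    minutes, seconds = divmod(value, 60)
    print(name, int(minutes), round(seconds, 3))
```

The adjusted values satisfy both conditions to the limit of the arithmetic, and agree with the printed results within a unit or two of the last place, the small differences arising from the rounding of the coefficients in the hand computation.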

§ 12. The Probable Error. Every intelligent observer 
desires to know something of the quality of his observations, 
how good or how bad they are ; the computer who has to com- 
bine the results of different series of observations should have 
some knowledge of their relative accuracy in order to assign 
to each series its proper weight ; and the investigator engaged 
in a complicated series of experiments desires some criterion 
by which to estimate the relative errors of the several parts 
of his work, in order to properly apportion his care among 
them, giving the maximum attention where the greatest errors 
are to be feared. It is evident from the nature of the case 
that no absolute criterion of this kind can be furnished, since 
any series of observations may be affected with systematic 
errors which seriously impair the accuracy of its results but 
furnish no indication of their presence. Both observer and 
computer do, however, estimate the accuracy of observations 
by their agreement among themselves, and that within certain 
limits this procedure is correct follows from Gauss's law of
error. If we suppose a very long series of observations
affected only by accidental errors, the values of the unknown
quantities obtained from the series will differ but little from
the true values (if the series is infinitely long they will be 
the true values), the residuals which they furnish will be very 
nearly the errors of observation, and the value of h in the 
equation of the error curve will furnish a measure of the 
precision of the observations as well as a measure of the 
smallness of the residuals. On the other hand, if the student 
attempts to construct the error curve corresponding to any 
short series of residuals, e.g., those of § 10, he will find that 
while they give him some information in regard to the curve 
there will be much that is arbitrary in its actual construction, 
and that many curves can be drawn which will appear to fit 
the residuals equally well, i.e. the amount of data in this case 
is insufficient to determine more than a rough approximation 
to the measure of precision of the observations. If the obser- 
vations are affected with systematic errors, the residuals may 
be very different from the errors of the observations, and will 
then furnish no indication of their accuracy. 

It thus appears that any conclusions in regard to the ac- 
curacy of a given set of observations must be treated with 
caution if they are based solely on the residuals furnished by 
the observations. Such conclusions are, in fact, valid only 
within certain limits whose general nature is indicated above, 
but within these limits the information thus furnished may be 
of much value, and it is frequently employed for the purposes 
indicated at the beginning of this section. 

The measure of precision, h, seems to be indicated by its 
name as the appropriate means of expressing the average 
accuracy of a set of observations, but in practice it is not so 
used, another function of the residuals being found more con- 
venient. If in a very long series of observations the residuals 
be arranged in the order of their numerical magnitude (with- 
out regard to sign), that residual which occupies the middle 
place in the series will have as many residuals greater than it 
as there are less than it, and in any future series of observa- 
tions of the same degree of precision as that here considered, 
it will be an even chance that any given residual will be 
greater than, or less than, the middle one above selected. This
middle residual is usually denoted by r, and is rather inappro- 
priately called the probable error of the series, the adjective 
having reference to the equal probabilities of the occurrence of 
residuals (errors) greater than, or less than, r. 

It is apparent that the greater the precision of any set of
observations, the smaller will be the corresponding probable
error, but the exact relation which exists between h and r 
must be derived from the equation of the error curve. The 
symmetry of this curve with respect to the axis of y shows 
that the same law of distribution holds for both positive and 
negative errors, and that in a very long series of residuals the 
probable error r will occupy the middle place among the 
positive errors and among the negative errors considered sepa- 
rately, as well as among all the errors taken without regard 
to sign. Since we are concerned only with the numerical 
magnitude of r we may confine our attention to the positive 
residuals, and find the relation between r and h from that half 
of the error curve which lies to the right of the axis of y. 

Since the probable error is a residual, it must be represented
by the abscissa of some point on the axis of x, and we may
determine this point from the condition that the ordinate
drawn through it bisects the area of that half of the curve
under consideration, since (from the relation between areas
and the number of residuals of a given magnitude developed
in § 4) this is the geometrical equivalent of the statement
that the number of residuals greater than r is equal to the
number less than r. By interpolation from the table in § 4,
the value of the argument corresponding to A = 0.25 is found
to be hx = hr = 0.477, whence the relation between the probable
error and the measure of precision is

$$r = \frac{0.477}{h} \tag{22}$$
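
The constant 0.477 may be verified without the table. The following sketch assumes that the error curve has the usual Gaussian form $y = \frac{h}{\sqrt{\pi}}e^{-h^2x^2}$, for which the area between 0 and x is one-half of the error function of hx; the condition A = 0.25 then becomes erf(hr) = 0.5, which is solved here by bisection. The function name is illustrative.

```python
import math

def probable_error_constant(tolerance=1e-12):
    """Find the value of hr for which erf(hr) = 0.5, by bisection."""
    low, high = 0.0, 1.0            # erf(0) = 0 and erf(1) is about 0.84
    while high - low > tolerance:
        middle = 0.5 * (low + high)
        if math.erf(middle) < 0.5:
            low = middle
        else:
            high = middle
    return 0.5 * (low + high)

rho = probable_error_constant()
print(round(rho, 4))
```

The root agrees with the value 0.477 obtained by interpolation from the table.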

The student will observe that in the definition of the probable
error reference is made to a very long series of observations,
and in a series of infinite length the value of r might be found
immediately from its definition, but in any ordinary set of
observations it is better to assume that the residuals are distributed
in accordance with the law of error, and to determine
the value of r from the relation between h and the sum of the
squares of the residuals, § 5, which gives

$$r = \pm\, 0.477\sqrt{2}\,\sqrt{\frac{[vv]}{n}}$$

We here encounter a difficulty arising from the attempt to apply
to a short series of residuals principles which are rigorously true
only when the series is of infinite length. Suppose the above
expression for r applied to a series of three observations involving
three unknown quantities whose values are derived from
the resulting observation equations. These values will exactly
satisfy the equations, no matter what the errors of the observations
may be, and the residuals being all zero, there will be
found r = 0 and h = ∞, which is absurd. The observations in
this case furnish no data from which to estimate their precision,
and in every such case where the number of observations
is equal to the number of unknown quantities, the
expression for the probable error ought to become indeterminate,
0/0. It is therefore customary to put

$$r = \pm\, 0.674\sqrt{\frac{[vv]}{n - \mu}} \tag{23}$$

in which μ denotes the number of quantities whose values have
been derived from the observations. This equation, which is
known as Bessel's expression for the probable error of a single
observation, being only an approximate one, we may usually
put ⅔ in place of the coefficient 0.674. Among German physicists
and astronomers, it is quite customary to suppress this
coefficient altogether, and to use the "mean error"

$$\epsilon = \sqrt{\frac{[vv]}{n - \mu}} \tag{24}$$

for the comparison of observations. Geometrically considered,
ε denotes the abscissa of the point of inflexion of the error
curve.
A simpler expression for the probable error may be obtained
by substituting in the equation

$$r = \frac{0.477}{h}$$

a value of h derived as follows: Let each member of the equation
of the error curve be multiplied by x dx and integrated
between the limits -∞ and +∞, giving

$$\int_{-\infty}^{+\infty} x\,y\,dx = \frac{h}{\sqrt{\pi}}\int_{-\infty}^{+\infty} x\,e^{-h^2x^2}\,dx$$

The value of the first integral in this equation is obviously 0,
since as we pass along the error curve from -∞ to +∞ every
value of y occurs once associated with a negative value of x,
and again with a numerically equal positive value, and for
every negative element xy dx in the integral there occurs an
equal positive xy dx so that the entire sum is 0. If, however,
we agree to neglect the sign of x and to consider only its
numerical value, we shall find

$$\int_{-\infty}^{+\infty} x\,y\,dx = 2\int_{0}^{+\infty} x\,y\,dx$$

and by a course of reasoning precisely similar to that applied
to $\int_{-\infty}^{+\infty} x^2 y\,dx$, it may be shown that $2\int_{0}^{+\infty} x\,y\,dx$
is equal to the mean of all the residuals taken without regard
to sign. We may therefore write

$$\frac{[\pm v]}{n} = \frac{2h}{\sqrt{\pi}}\int_{0}^{+\infty} x\,e^{-h^2x^2}\,dx$$

where the ± inside the brackets denotes that all of the residuals
are to be treated as positive quantities. Putting hx = t
in the second member of this equation and remembering
that here also we are concerned only with numerical values
of x without regard to sign, we obtain

$$\frac{[\pm v]}{n} = \frac{2}{h\sqrt{\pi}}\int_{0}^{\infty} t\,e^{-t^2}\,dt = \frac{2}{h\sqrt{\pi}}\left[-\tfrac{1}{2}e^{-t^2}\right]_{0}^{\infty}$$

Introducing the limits into the integrated expression there
results

$$\frac{[\pm v]}{n} = \frac{1}{h\sqrt{\pi}}$$

and

$$r = \pm\, 0.477\sqrt{\pi}\;\frac{[\pm v]}{n} \tag{25}$$



This formula is rigorously correct only when the number of
observations is infinite, and it must be transformed so as to
become indeterminate when the number of observation equations
is just sufficient to determine the unknown quantities,
i.e., when n = μ. This might be accomplished by writing
n - μ in place of n, as was done in equation (23), but it is
customary to substitute in this case $\sqrt{n(n - \mu)}$, which also
renders r indeterminate when n = μ, and gives values of r more
nearly in agreement with equation (23). Making this substitution,
we have

$$r = \pm\, 0.845\,\frac{[\pm v]}{\sqrt{n(n - \mu)}} \tag{26}$$

which, when μ = 1, is known as Peters' formula for probable errors.
This formula is very convenient for the numerical computation
of probable errors. Where the number of observations
is small the results furnished by equation (23) are considered
more reliable, but neither formula can furnish a good determination
of probable errors from a small number of observations.
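
The agreement of equations (23) and (26) in a long series may be tried by experiment. The following is a minimal sketch which draws a long series of artificial errors from a Gaussian law with unit standard deviation, for which the probable error should be about 0.6745. The sample size, the seed, and the function names `bessel` and `peters` are assumptions of the sketch, not part of the text.

```python
import math
import random

def bessel(residuals, mu=1):
    """Equation (23): probable error from the squares of the residuals."""
    n = len(residuals)
    return 0.674 * math.sqrt(sum(v * v for v in residuals) / (n - mu))

def peters(residuals, mu=1):
    """Equation (26): probable error from the first powers of the residuals."""
    n = len(residuals)
    return 0.845 * sum(abs(v) for v in residuals) / math.sqrt(n * (n - mu))

rng = random.Random(0)                       # fixed seed, for repeatability
sample = [rng.gauss(0.0, 1.0) for _ in range(20000)]
mean = sum(sample) / len(sample)
residuals = [x - mean for x in sample]       # one unknown, so mu = 1

r_bessel = bessel(residuals)
r_peters = peters(residuals)
print(round(r_bessel, 3), round(r_peters, 3))
```

With so long a series the two formulae give nearly identical results; with a dozen observations they may differ by several per cent.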
The numerical application of these formulae may be illus- 
trated by the following short series of sextant observations for 
the determination of latitude. 



    Observations       v       vv

    43°  4' 46"       19"     361        log [±v]               2.403
         4  24         3        9        a.c. log √(n(n - 1))   8.940 - 10
         4   7        20      400        log 0.845              9.926 - 10
         4  28         1        1        log r                  1.269
         4  59        32     1024        r                    ± 18".6
         4  39        12      144
         4  52        25      625        log [vv]               3.886
         4  52        25      625        log (n - 1)            1.041
         3  47        40     1600        log ([vv]/(n - 1))     2.845
         4  15        12      144        log √([vv]/(n - 1))    1.422
         3  36        51     2601        log 0.674              9.829 - 10
         4  40        13      169        log r                  1.251

    Mean = 43  4 27   [±v] = 253   [vv] = 7703      r         ± 17".8

    n = 12     μ = 1
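
The logarithmic computation above may be checked directly. In the following sketch the twelve observations are reduced to seconds of arc above 43° 0' 0", and the mean is rounded to the nearest second, as appears to have been done in the table; the variable names are illustrative.

```python
import math

# Minutes and seconds of the twelve sextant observations (degrees are all 43).
readings = [(4, 46), (4, 24), (4, 7), (4, 28), (4, 59), (4, 39),
            (4, 52), (4, 52), (3, 47), (4, 15), (3, 36), (4, 40)]

seconds = [60 * m + s for m, s in readings]
n = len(seconds)
mu = 1                                     # one unknown: the latitude
mean = round(sum(seconds) / n)             # 267", i.e. 4' 27"

residuals = [abs(s - mean) for s in seconds]
sum_v = sum(residuals)                     # [±v]
sum_vv = sum(v * v for v in residuals)     # [vv]

r_peters = 0.845 * sum_v / math.sqrt(n * (n - mu))      # equation (26)
r_bessel = 0.674 * math.sqrt(sum_vv / (n - mu))         # equation (23)
print(sum_v, sum_vv, round(r_peters, 1), round(r_bessel, 1))
```

The sums and both values of r agree with those of the table.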
The difference between the values of r found from the first
and second powers of the residuals is small compared with the 
uncertainty of each arising from the small number of obser- 
vations. 

In so far as these observations can be considered as furnish- 
ing a value of r, they indicate that in a future series of similar 
and equally precise observations, there should be as many 
observations furnishing residuals (errors) greater than 18" as 
there are observations giving residuals less than 18". The ± 
which is commonly prefixed to the numerical value of r, denotes 
that the observed quantity is as apt to err in excess as in 
defect. 

Let the student derive from the residuals given in § 10 a
determination of the probable error of an observed V, noting
that in this case μ = 3.

§ 13. Probable Error of a Function of Observed Quantities.

Let x', x'', x''' denote quantities which have been determined
from observation, and let r', r'', r''' be their probable errors.
Let u be a quantity whose value has been computed from the
values of x', x'', x''' by means of the relation

$$u = f(x', x'', x''')$$

It is evident that the precision with which u is determined
depends upon the precision of x', x'', x''', and by a slight extension
of the term "probable error" we may consider the precision
of u to be represented by a probable error, r, and may
inquire the relation of r to r', r'', r'''.

Since a probable error is one of the residuals or errors of a
very long series, we may obtain the desired relation between
r, r', r'', r''' from a consideration of the general relation of any
set of errors v', v'', v''', in x', x'', x''', to the corresponding
error, v, in u. This relation is

$$v = \frac{du}{dx'}\,v' + \frac{du}{dx''}\,v'' + \frac{du}{dx'''}\,v'''$$

(see § 8). To avoid the necessity for considering the signs of
v', v'', v''', let this equation be squared, giving

$$v^2 = \left(\frac{du}{dx'}\right)^2 v'^2 + \left(\frac{du}{dx''}\right)^2 v''^2 + \left(\frac{du}{dx'''}\right)^2 v'''^2$$

from which all terms involving the products v'v'', v'v''', v''v''',
etc., have been dropped for the reason that the probable error
of u depends upon the average magnitude of v, and in the long
run any pair of residuals v', v'' will have opposite signs as
often as they have like signs, and will therefore produce an
equal number of positive and negative terms whose effect upon
the mean value of v² will be very small compared with the
terms containing v'², v''², v'''², which are always positive. Replacing
these actual errors by the corresponding probable
errors, we obtain

$$r^2 = \left(\frac{du}{dx'}\right)^2 r'^2 + \left(\frac{du}{dx''}\right)^2 r''^2 + \left(\frac{du}{dx'''}\right)^2 r'''^2 \tag{27}$$

and an equation of similar form will express the relation of
the probable error of the function to the probable errors of the
quantities upon which it depends, whatever the number of
these quantities may be.

We proceed to apply this relation to a few simple cases of 
frequent occurrence in practice. 

(a) The probable error of the sum of n observed quantities.
In this case

$$u = x' + x'' + x''' + \cdots + x^{(n)}$$

and each of the differential coefficients $\frac{du}{dx'}$, $\frac{du}{dx''}$, etc., equals 1;
whence

$$r^2 = r'^2 + r''^2 + r'''^2 + \cdots + r^{(n)2} \tag{28}$$

(b) The probable error of the mean of n observed quantities.
In this case

$$u = \frac{1}{n}\left(x' + x'' + x''' + \cdots + x^{(n)}\right)$$

$$\frac{du}{dx'} = \frac{du}{dx''} = \cdots = \frac{1}{n}$$

$$r^2 = \frac{1}{n^2}\left(r'^2 + r''^2 + r'''^2 + \cdots\right)$$

We have here to distinguish two cases. If the x's are all of
equal precision, the r's are equal, and may be represented by a
common symbol $r_1$; whence

$$r = \pm\sqrt{\frac{r_1^2}{n}} = \pm\frac{r_1}{\sqrt{n}} \tag{29}$$

If the observations are of unequal precision represented by
weights p', p'', p''', ..., $p^{(n)}$, we have

$$u = \frac{p'x' + p''x'' + p'''x''' + \cdots + p^{(n)}x^{(n)}}{[p]}$$

$$\frac{du}{dx'} = \frac{p'}{[p]} \qquad \frac{du}{dx''} = \frac{p''}{[p]} \qquad \text{etc.}$$

$$r = \pm\frac{r_1}{\sqrt{[p]}} \tag{30}$$

where $r_1$ denotes the probable error of an observation whose
weight is 1.

The relations here derived between the probable error of a
single determination of a quantity, and the probable error of
the mean of n determinations, may be employed in connection
with equation (23), to determine the probable error of an
adopted value based upon several determinations of a quantity.
Thus, in the general case of observations of unequal weight, if
$r_1$ represent the probable error of an observation of weight 1,
and r the probable error of the weighted mean, we have from
equation (23),

$$\left.\begin{aligned}
r_1 &= 0.674\sqrt{\frac{[pvv]}{n - 1}} \\
r &= \frac{r_1}{\sqrt{[p]}}
\end{aligned}\right\} \tag{31}$$



Let the student show that when the observations are of
equal precision and

$$u = a(x' - x'' - x'''), \qquad r = \pm a\sqrt{3}\,r'$$

$$u = \sin\frac{x'}{a}, \qquad r = \pm\cos\frac{x'}{a}\cdot\frac{r'}{a}$$
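The two exercises, and equation (29), can be checked numerically. The following is only a sketch: it assumes the relation r² = Σ (du/dx)² r² used in cases (a) and (b), estimates the differential coefficients by finite differences, and uses purely illustrative values for a, x', and r'. The function name `probable_error` is hypothetical.

```python
import math

def probable_error(f, xs, rs, h=1e-6):
    """Combine the probable errors rs of the quantities xs through u = f(xs)."""
    total = 0.0
    for i, r_i in enumerate(rs):
        up = list(xs)
        up[i] += h
        dn = list(xs)
        dn[i] -= h
        du_dx = (f(up) - f(dn)) / (2 * h)  # central-difference estimate of du/dx_i
        total += (du_dx * r_i) ** 2
    return math.sqrt(total)

a, r1 = 2.0, 0.1  # illustrative values only, not taken from the text

# Exercise 1: u = a(x' - x'' - x''') should give r = a * sqrt(3) * r'
r_diff = probable_error(lambda x: a * (x[0] - x[1] - x[2]), [1.0, 2.0, 3.0], [r1] * 3)

# Exercise 2: u = sin(x'/a) should give r = cos(x'/a) * r'/a
x0 = 0.7
r_sin = probable_error(lambda x: math.sin(x[0] / a), [x0], [r1])

# Equation (29): the mean of n equally precise observations gives r1 / sqrt(n)
n = 5
r_mean = probable_error(lambda x: sum(x) / n, [1.0] * n, [r1] * n)

print(r_diff, a * math.sqrt(3) * r1)
print(r_sin, math.cos(x0 / a) * r1 / a)
print(r_mean, r1 / math.sqrt(n))
```

Each printed pair should agree to within the accuracy of the finite-difference derivative.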

§ 14. Assignment of Weights. Rejection of Observations.

The term weight has been employed in the preceding sections
as a measure of the quality of an observation, but its use is by
no means limited to the case of single observations. Thus,
from an Investigation of the Distance of the Sun, etc., by S.
Newcomb, we select the following determinations of the solar
parallax.

Method by which determined.                  Parallax.   Weight.

Meridian observations of Mars, 1862.          8".855       25
Micrometric observations of Mars, 1862.       8 .842        6
Parallactic inequality of the Moon.           8 .838       16
Lunar equation of the Earth.                  8 .809        3
Transit of Venus, 1769.                       8 .860        6

Each value of the parallax here given is the final result of
an elaborate discussion of many observations, and the weights
indicate the relative excellence attributed to these results by
the author of the investigation. If π denote any one of these
values of the parallax, p its weight, and π₀ the most probable
value of the parallax, we shall have

$$\pi_0 = \frac{[p\pi]}{[p]} = 8''.847 \qquad (\S\ 6)$$

It is to be noted that this value depends upon the weights
assigned to the individual determinations, and that by properly
selecting the weights, π₀ may be made to assume any value
whatever between the least and the greatest single determination.
Thus if the weight 100 be assigned to the value 8".860,
and to each of the other values the weight 1, we shall find
π₀ = 8".859, while a weight 100 for the value 8".809 with a
weight 1 for each of the others, makes π₀ = 8".811. Between
these limits the value of π₀ depends upon the judgment of the
computer in assigning weights, and this determination of
weights is one of the most delicate questions that arise in the
application of the method of least squares.
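The dependence of π₀ upon the assigned weights is easily verified by direct computation. The sketch below uses the five parallaxes and weights tabulated above; the function name `weighted_mean` is merely illustrative.

```python
parallax = [8.855, 8.842, 8.838, 8.809, 8.860]  # seconds of arc, from the table
weight = [25, 6, 16, 3, 6]

def weighted_mean(values, weights):
    """Most probable value [p x] / [p] of weighted determinations."""
    return sum(p * v for p, v in zip(weights, values)) / sum(weights)

pi0 = weighted_mean(parallax, weight)
favour_high = weighted_mean(parallax, [1, 1, 1, 1, 100])  # weight 100 on 8".860
favour_low = weighted_mean(parallax, [1, 1, 1, 100, 1])   # weight 100 on 8".809

print(f"{pi0:.3f} {favour_high:.3f} {favour_low:.3f}")  # prints 8.847 8.859 8.811
```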

A relation between weights and probable errors may easily
be established, which is frequently of service in that it enables
the problem of weights to be stated in a different form. Let x
denote an observation whose probable error is r and whose
weight is 1, and denote by x', x'', x''', etc., observations or
combinations of observations of the same quantity, whose weights
and probable errors are represented by p', p'', p''', r', r'', r''',
etc. In accordance with the definition of weights, x' is the
equivalent of p' observations of the same quality as x, and
from the equations derived in the preceding section we have

$$r' = \frac{r}{\sqrt{p'}}$$

with a similar expression for each of the other quantities;
whence

$$p'r'^2 = p''r''^2 = p'''r'''^2 = r^2$$

and

$$p' = \frac{r^2}{r'^2}, \qquad p'' = \frac{r^2}{r''^2}, \qquad p''' = \frac{r^2}{r'''^2} \qquad (32)$$

and, in general, the weights are inversely proportional to the
squares of the probable errors.

It has been sufficiently shown that probable errors derived
from the residuals furnished by a series of observations represent
only the effects of accidental errors of observation, but we
may extend the significance of the term so as to include an
estimate of the effect upon x', x'', x''' of systematic errors in
the observations. Let $r_1$ and $r_2$ represent those parts of the
probable error which come from these two sources respectively,
and from § 13 we find for their combined effect

$$\sqrt{r_1^2 + r_2^2}$$

and the expression for the weights becomes

$$p = \frac{r^2}{r_1^2 + r_2^2} \qquad (33)$$

By this device the determination of weights is reduced to an
estimate of the combined effect of accidental and systematic
errors of observation upon the quantity whose weight is
desired, and it was from an estimate of this character that the
weights of the parallaxes given above were derived.

If r' denote the probable accidental error of a single observation,
and the quantity whose weight is p is the mean of n
such observations, we shall have

$$p = \frac{1}{\dfrac{r'^2}{n} + r_2^2} \qquad (34)$$

from which all constant factors have been dropped, since
only relative values of p are ever required. It appears from
this equation that if the systematic errors, $r_2$, are very small
compared with the accidental errors, r', the weight increases
rapidly as the number of observations is increased, but if the
systematic errors are large, the weight is but little affected by
the number of observations; a relation to be considered in
deciding how many observations shall be made to determine
an unknown quantity.
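The behaviour described may be seen by tabulating the weight of a mean of n observations relative to the weight of a single observation. The sketch assumes equation (34) in the form p = 1/(r'²/n + r₂²); the numerical values of r' and r₂ are illustrative only, and the function name is hypothetical.

```python
def relative_weight(n, r_acc, r_sys):
    """Equation (34): p = 1 / (r'^2 / n + r_2^2), constant factors dropped."""
    return 1.0 / (r_acc ** 2 / n + r_sys ** 2)

def gain(n, r_acc, r_sys):
    """Weight of the mean of n observations relative to a single observation."""
    return relative_weight(n, r_acc, r_sys) / relative_weight(1, r_acc, r_sys)

for r_sys in (0.1, 2.0):  # small and large systematic error, with r' = 1
    print(r_sys, [round(gain(n, 1.0, r_sys), 2) for n in (1, 4, 16, 64)])
```

With the small systematic error the gain grows nearly in proportion to n; with the large one it can never exceed (r'² + r₂²)/r₂², which is 1.25 for the values assumed here.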

In some cases it may be impossible to form any reliable 
estimate of the effect of systematic errors, and results which 
have been derived by different methods, or under different cir- 
cumstances, may then be given equal weights on the supposition 
that they are affected by different systematic errors which it 
is equally important to eliminate; but this is equivalent to 
putting r₂ = ∞ in the equation for the weights, and it will
rarely happen that this is the best estimate which can be made 
for the amount of the systematic errors. 

It frequently happens that in a series of otherwise accordant 
observations, one or two will be found which differ widely from 
the others, and which if included in the final result, will 
furnish large residuals. What shall be done with observations 
of this kind has long been a vexed question. To reject them 
is equivalent to assigning to them the weight 0, and is the 
expression of the computer's judgment that they can contribute 
nothing to the accuracy of the result which he seeks to obtain.
In an infinitely long series of observations, errors of any finite 
magnitude may be permitted without impairing the accuracy 
of the final result, and the existence of such errors seems con- 
templated by the theory which we have adopted, since the 
equation of the error curve gives finite values of y for all 
values of x between the limits — oo and + oo . But in the 
actual case which arises in practice where a result must be 
obtained from a comparatively small number of observations, 
a single one of these, if affected with a large error, may make 
the final result farther from the truth than any one of the 
other observations. On the other hand, cases are by no means 
unknown in which a single discordant observation in a series
proves to be nearest to the true value of the quantity sought, 
the others having all been vitiated by some common cause ; 
and between these extremes an infinite variety of cases may 
be found. It must in general remain a matter of doubt 
whether a given discordant observation should or should not 
be rejected, and the decision made by the computer must be 
his judgment based upon all the data available as to whether 
more will be gained by rejecting than by retaining it. A knowl- 
edge of the way in which observations are made, of the circum- 
stances attending the particular observation in question, the 
magnitude of the errors which may reasonably be expected 
with the given observer and apparatus, or instrument, are
elements which should be included in this judgment ; and the 
observer will greatly facilitate its formation by making copious 
notes at the time of observation of all circumstances which in 
his opinion may affect the quality of his work, and particularly 
by noting any abnormal circumstances affecting a single obser- 
vation or a part of the observations. 

A doubtful observation should be rejected if it is the com- 
puter's deliberate judgment that its retention will hurt more 
than it will help his final result, but it is never legitimate for 
the computer to suppress an observation. A rejected observation
should be included in the statement of his data, and may
properly be accompanied by an explanation of the reasons for 
its rejection, in order that any person interested in the result
may form his own judgment of the data and the manner in 
which they have been discussed, and may, if necessary, re- 
discuss the observations in accordance with that judgment. 

The conclusion of the whole matter of assigning weights to 
numerical data may be summed up in the statement that no 
mathematical expression will suffice for this purpose, but the 
weights must be determined by an exercise of personal judg- 
ment, and the wider the knowledge upon which this judgment 
is based, the greater confidence will the weights and the result- 
ing values of the unknown quantities command. 

§ 15. Empirical or Interpolation Formulæ. In the preceding
sections attention has been directed to that class of problems
in which the theoretical relation between the observed quantities
and those whose values are to be determined is known;
that is, an equation of known form exists between them, and 
the problem has been to determine the values of the constants 
which appear in the equation. But a very different class of 
cases now demands a passing notice. 

A series of observations is sometimes found to be affected 
with errors too great to be explained as the result of unavoid- 
able and fortuitous causes, and it becomes apparent that the 
law of recurrence of these errors must be determined before 
the observations can be made to yield any valuable results. 
The American parties which were sent out in 1874 to observe 
the transit of Venus were provided with instruments for the 
determination of their local time, of such a character that the 
accidental error of a determination from a single star might 
fairly be estimated at 0ˢ.05 or 0ˢ.06, but results obtained from
observations of different stars varied among themselves by 
more than ten times this amount. An inspection of the dis- 
crepancies having shown that they depended in some way upon 
the distance of the observed star from the zenith, it was found 
by trial that the error at any zenith distance, z, could be
represented by the expression

$$E = \pm\,[\,a\cos z - b\sin 2z\,]$$

where a and b are constants whose values were found from the
observations themselves. The physical cause of these errors
was subsequently found to be the bending of the instrument
under its own weight, but it is to be noted that the above law 
of recurrence of the errors was determined first, the cause 
afterwards. 

Expressions of this kind are sometimes called interpolation
formulæ and sometimes empirical equations; the one term having
reference to their use, the other to their derivation. They
are of very general use in all branches of physical science, 
since they may be made to serve as a convenient summary of a 
vast amount of numerical data, and one of the most important 
applications of the method of least squares is in determining 
the values of the constants which enter into such expressions. 
The problems treated in §§ 1 and 10 both belong to this class, 
and the following expression for the magnetic declination at 
Washington, D.C., derived by Mr. C. A. Schott * from a series
of observations extending over ninety years may serve as a
further illustration:

$$\text{Mag. Dec.} = 2^\circ.47 + 2^\circ.50\,\sin\,[1^\circ.40\,(T - 1850) - 14^\circ.6]$$

where T denotes the year for which the declination is required. 
When the cause whose effects are to be represented by an 
equation is known, the form of this equation can usually be 
derived by mathematical analysis ; but where empirical formulae 
are employed other methods must be resorted to. The simplest 
of these is a graphical representation of the errors or other 
data under consideration. For this purpose let the errors 
represent ordinates, and the values of any variable upon which 
they are supposed to depend, the corresponding abscissas. Let 
points be plotted with these ordinates and abscissas as was 
done in obtaining the form of the error curve, Figs. A, B, C, D,
and let a smooth curve be drawn through these points either
free-hand or by the aid of a draughtsman's "irregular curve."
The distance of each plotted point from the curve, measured
along an ordinate, is the residual corresponding to the point,
and in accordance with the principle of least squares the curve
should be so drawn as to make the sum of the squares of these
residuals as small as possible, without unduly complicating the
curve. If the variable has been properly chosen it will in
many cases be found possible to draw a simple curve which
shall represent the data within the limits of the accidental
errors of observation, and as this curve is the graphical
representation of the law required, its equation, y = f(x), is the
analytical representation of that law. In this manner the
form of the equation treated in § 10 was obtained.

* U. S. C. & G. S. Report, 1882, p. 258.

In other cases it will be possible to draw a smooth and sim- 
ple curve which shall not represent the data within the limits 
of accidental error of the observations, but about which the 
points will be grouped, alternating from one side to the other 
in a systematic manner. Let the excess of the ordinate of any 
point over the corresponding ordinate of the curve be plotted 
with the given abscissa in a new curve. The two curves thus 
constructed will together form the graphical representation of 
the law of the data, and the analytic expression of that law
will be

$$y = f(x) + \phi(x)$$

if y = f(x) and y = φ(x) are the equations of the two curves
respectively.

In some cases the curves themselves will be a sufficient 
representation of the data, and it will be unnecessary to deter- 
mine their equations since the value of y corresponding to any 
given value of x may be obtained by direct measurement. In 
other cases the curve will be chiefly serviceable in suggesting 
the probable form of an equation between the observed quan- 
tity and a variable upon which it is supposed to depend, or in 
showing that no simple relation exists between them. Two 
forms of equation are of such frequent use in this connection 
that they deserve especial notice. 

If the plotted curve does not differ very greatly from a 
straight line, the relation of the variable x to its function y 
may be represented by 

$$y = a + bx + cx^2 + dx^3 + \text{etc.} \qquad (35)$$

This equation contains the first few terms of an infinite
series by which a limited arc of any continuous curve can be 
represented, and since the actual relation between y and x 
could be represented by a curve, if its mathematical expression 
were known, it follows that the above equation can be made to 
represent this relation over a certain range of values of x, by 
assigning proper values to the coefficients. The number of 
terms of this series which should be taken into account, and 
the limits of x beyond which the equation is not applicable, 
depend upon the actual relation between y and x, and are 
therefore unknown ; but, in general, it is not well to attempt to 
use this equation for large values of x, or when more than
three or at most four terms are required. Its application in a 
simple case is illustrated in the problem of § 1, where y and x 
being replaced by the length of the bar and its temperature, it 
is assumed that their relation can be expressed within the 
range of temperature over which the observations extend, by 
the first two terms of the series. 

The second type of equation above referred to is

$$\begin{aligned}
y = a_0 &+ a_1\cos\frac{x}{m} + a_2\cos\frac{2x}{m} + a_3\cos\frac{3x}{m} + \text{etc.} \\
&+ b_1\sin\frac{x}{m} + b_2\sin\frac{2x}{m} + b_3\sin\frac{3x}{m} + \text{etc.}
\end{aligned} \qquad (36)$$

in which m is an undetermined constant expressed in the same
unit as x; $\dfrac{x}{m}$ is therefore a ratio, or absolute number, which
in the application of the equation to numerical data must
be transformed into circular measure by multiplying it by
$\dfrac{180^\circ}{\pi} = 57^\circ.29578$. This form of equation may be made to
represent any relation whatever between finite values of y and
x, including those cases in which y is a discontinuous function,
but it is especially advantageous when y is a periodic function
of x, i.e., one in which the same values of y recur for values of
x separated by a constant interval, τ, called the period of x,
so that

$$f(x) = f(x + \tau) = f(x + 2\tau) = \cdots = f(x + n\tau).$$

The simplest type of such a function is y = sin x, the period
in this case being τ = 360°; e.g.,

sin 10° = sin (10° + 360°) = sin (10° + 720°) = etc.

When y is such a function the constant m should be put equal
to the period divided by 2π, $m = \dfrac{\tau}{2\pi}$; in other cases the value
of m should be so taken that the greatest value of $\dfrac{x}{m}$ included
among the data shall not exceed π. The application of this
formula may often be facilitated by noting that if the relation
between y and x is such that f(+x) = f(−x), the sine terms
all disappear, and the equation reduces to

$$y = a_0 + a_1\cos\frac{x}{m} + a_2\cos\frac{2x}{m} + \text{etc.} \qquad (37)$$

while if f(x) = −f(−x), the cosine terms vanish, and the
equation becomes

$$y = b_0 + b_1\sin\frac{x}{m} + b_2\sin\frac{2x}{m} + \text{etc.} \qquad (38)$$

mm 

The several forms above given to this type of equation are
those most convenient for use when the values of the coefficients
a, b, etc., are to be determined, but after their numerical
values have been found it is advantageous to transform the
equation as follows: Introduce the auxiliary quantities

$$n_0,\ n_1,\ n_2,\ N_1,\ N_2,\ \text{etc.}$$

defined by the relations

$$n_0 = a_0, \qquad n_1\cos N_1 = a_1, \qquad n_2\cos N_2 = a_2,$$
$$n_1\sin N_1 = b_1, \qquad n_2\sin N_2 = b_2,$$

and the equation becomes

$$y = n_0 + n_1\cos\left(\frac{x}{m} - N_1\right) + n_2\cos\left(\frac{2x}{m} - N_2\right) + \text{etc.} \qquad (39)$$

each pair of terms of the original equation being here replaced
by a single term. The expression for the magnetic declination
at Washington given on p. 59 is of this type, as may be seen
by writing it in the equivalent form

$$\text{Mag. Dec.} = 2^\circ.47 + 2^\circ.50\,\cos\,[1^\circ.40\,(T - 1850) - 104^\circ.6]$$
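That the sine and cosine forms of the declination formula are equivalent follows from sin θ = cos (θ − 90°), and may be confirmed numerically. The sketch below simply evaluates both forms, with the constants read as 2°.47, 2°.50, 1°.40, 14°.6, and 104°.6, over the ninety years covered by the observations; the function names and the range of years chosen are illustrative.

```python
import math

def declination_sin(year):
    """Magnetic declination at Washington, sine form, in degrees."""
    return 2.47 + 2.50 * math.sin(math.radians(1.40 * (year - 1850) - 14.6))

def declination_cos(year):
    """The same quantity in the form of equation (39), in degrees."""
    return 2.47 + 2.50 * math.cos(math.radians(1.40 * (year - 1850) - 104.6))

# Largest disagreement between the two forms over an illustrative span of years
largest_difference = max(
    abs(declination_sin(t) - declination_cos(t)) for t in range(1790, 1881)
)
print(largest_difference)  # zero apart from rounding error
```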

The mode of applying this form of equation may be illustrated
by means of the following data selected from the series
of observations whose residuals are plotted in Fig. C. The
observed quantity, B, is the difference of stellar magnitude
(brightness) between the planet Saturn and his satellite Iapetus.
The quantity I given with each observed B fixes the
position of the satellite in its orbit at the time of observation,
and is analogous to the variable angle in a system of polar
coordinates.

   I.       B.     Residual.       I.       B.     Residual.
            m         m                     m         m
  10°     10.82     -0.28        200°     10.66     +0.28
  70      11.81     +0.22        230       9.87     -0.21
 110      11.69     -0.02        270      10.43     +0.21
 140      11.42     -0.03        310      10.48     -0.16

B is here seen to run through a complete cycle of values
between the limits 9.87 and 11.81, while I varies from 0° to
360°. We shall therefore endeavor to represent B as a periodic
function of I whose period is 360°. In accordance with this
assumption we put

$$x = I, \qquad m = \frac{360^\circ}{2\pi} = \frac{180^\circ}{\pi}$$

and taking into account the first five terms of the series, the
several observations furnish the following

Observation Equations.

10.82 = a_0 + 0.98 a_1 + 0.17 b_1 + 0.94 a_2 + 0.34 b_2
11.81 = a_0 + 0.34 a_1 + 0.94 b_1 - 0.77 a_2 + 0.64 b_2
11.69 = a_0 - 0.34 a_1 + 0.94 b_1 - 0.77 a_2 - 0.64 b_2
11.42 = a_0 - 0.77 a_1 + 0.64 b_1 + 0.17 a_2 - 0.98 b_2
10.66 = a_0 - 0.94 a_1 - 0.34 b_1 + 0.77 a_2 + 0.64 b_2
 9.87 = a_0 - 0.64 a_1 - 0.77 b_1 - 0.17 a_2 + 0.98 b_2
10.43 = a_0 + 0.00 a_1 - 1.00 b_1 - 1.00 a_2 + 0.00 b_2
10.48 = a_0 + 0.64 a_1 - 0.77 b_1 - 0.17 a_2 - 0.98 b_2

The solution of these equations will be found in the following
section. The values obtained for the constants are

a_0 = 10.92, a_1 = +0.15, b_1 = +0.74, a_2 = -0.04, b_2 = -0.18

Introducing the constants n, N, and determining their values
from the relations

$$n_0 = 10.92, \qquad n_1\cos N_1 = +0.15, \qquad n_2\cos N_2 = -0.04,$$
$$n_1\sin N_1 = +0.74, \qquad n_2\sin N_2 = -0.18,$$

the equation becomes

$$B = 10.92 + 0.76\cos\,(I - 78^\circ.5) - 0.18\cos\,(2I - 77^\circ.6)$$

The residuals obtained by comparing the values of B computed
from this formula with the observed values are given above
with the data.
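The transformation to the constants n, N and the comparison with the observations may be sketched as follows. The constants are those quoted in the text; the value I = 70° for the second observation is inferred from the coefficients 0.34 and 0.94 of its observation equation, and the function names are illustrative.

```python
import math

a0, a1, b1, a2, b2 = 10.92, 0.15, 0.74, -0.04, -0.18  # constants quoted in the text

def amplitude_phase(a, b):
    """Solve n cos N = a, n sin N = b for n >= 0 and N in degrees."""
    return math.hypot(a, b), math.degrees(math.atan2(b, a))

n1, N1 = amplitude_phase(a1, b1)
# N2 comes out near -102.5 degrees; shifting it by 180 degrees and reversing
# the sign of n2 gives the text's form -0.18 cos(2I - 77.6 degrees).
n2, N2 = amplitude_phase(a2, b2)

def b_series(i_deg):
    """B from the sine and cosine terms, as in equation (36)."""
    i = math.radians(i_deg)
    return (a0 + a1 * math.cos(i) + b1 * math.sin(i)
            + a2 * math.cos(2 * i) + b2 * math.sin(2 * i))

def b_phase(i_deg):
    """B from the amplitude and phase form, as in equation (39)."""
    return (a0 + n1 * math.cos(math.radians(i_deg - N1))
            + n2 * math.cos(math.radians(2 * i_deg - N2)))

observed = [(10, 10.82), (70, 11.81), (110, 11.69), (140, 11.42),
            (200, 10.66), (230, 9.87), (270, 10.43), (310, 10.48)]
tabulated = [-0.28, 0.22, -0.02, -0.03, 0.28, -0.21, 0.21, -0.16]

residuals = [B - b_series(I) for I, B in observed]
print(round(n1, 2), round(N1, 1))
print([round(v, 2) for v in residuals])
```

The computed residuals agree with the tabulated ones to within about 0.01, the small differences arising from the rounding of the constants to two decimals.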

Abundant data for exercise in deriving empirical formulae of 
this kind may be found in the United States Coast and Geodetic 
Survey Report for 1882, pp. 218-257. 

§ 16. Approximate Solutions. It is often desirable to obtain 
from a series of observations, as rapidly and with as little labor 
as possible, a set of values of the unknown quantities involved 
which shall be fair approximations to their most probable val- 
ues, but in which the highest degree of accuracy is not required. 
In cases of this kind the least square treatment of the obser- 
vation equations as illustrated in § 10 is too long and laborious, 
and the following method may often be substituted for it with 
advantage. 

Let there be any number, e.g., three, unknown quantities
involved in a set of observation equations of the form

$$ax + by + cz + n = 0, \qquad \text{Weight} = p$$

and let each of these equations be multiplied by its weight,
giving the group of equations,

$$\begin{aligned}
a_1x + b_1y + c_1z + n_1 &= 0 \qquad k_1 \\
a_2x + b_2y + c_2z + n_2 &= 0 \qquad k_2 \\
\text{etc.} \qquad\qquad & \qquad\quad\ \text{etc.}
\end{aligned} \qquad (40)$$

Multiply each of these equations by the undetermined constant
k placed opposite it, and let the sum of all the resulting
equations be formed. By the use of the summation
symbol, [ ], this sum may be written

$$[ka]x + [kb]y + [kc]z + [kn] = 0$$

Since the several values of k which enter into this equation
are entirely arbitrary it would be permissible to assign to them
such values that [kb] and [kc] should each equal 0, which
would give at once

$$x = -\frac{[kn]}{[ka]}$$

This, however, is not practically advantageous on account of
the labor involved in determining the values of k. We therefore
put

$$x = -\frac{[kn]}{[ka]} - \frac{[kb]}{[ka]}\,y - \frac{[kc]}{[ka]}\,z \qquad (42)$$

and, limiting the values of k to +1 and −1, assign them in
such a manner that [ka] shall be made as great, and [kb],
[kc] as small as possible. In this manner the coefficients of
y and z may often be made so small that if approximate values
of y and z are substituted in equation (42), they will furnish
a sufficiently accurate value of x; since the effect of the errors
of these approximations will be much diminished by the small
coefficients by which they are multiplied.

The value of y may be found in the same manner by selecting
a set of k's which shall make [kb] large and [ka], [kc]
small, and similarly for z. Two or three trials may be required
before sufficiently close approximations to the values of x, y, z
are obtained, but these trials are rapidly and easily made, and,
if necessary, in exceptional cases the summation equations
may be written in the form

$$\begin{aligned}
[k'a]x + [k'b]y + [k'c]z + [k'n] &= 0 \\
[k''a]x + [k''b]y + [k''c]z + [k''n] &= 0 \\
[k'''a]x + [k'''b]y + [k'''c]z + [k'''n] &= 0
\end{aligned} \qquad (43)$$

and the equations solved by any of the methods of elementary
algebra, but in every case the values of k, +1 and −1, should
be so chosen that in the first equation the coefficient of x, in
the second equation the coefficient of y, and in the third
equation the coefficient of z, shall be made as large, and all of the
other coefficients as small as possible.

By this mode of solution each observation with its proper 
weight is included in the determination of the unknowns, but 
since the principle of least squares has not been taken into 
account, it cannot be expected that the resulting values will 
be the best that the observations can be made to yield. 
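The formation of a summation equation may be sketched as follows, using the observation equations of § 15 with 10.00 subtracted from each observed B, so that each equation reads n = a + (cos I) a_1 + (sin I) b_1 + (cos 2I) a_2 + (sin 2I) b_2. Only two sets of k are shown: all +1, which favors the constant term, and the sign of cos I, which favors a_1. The rule used here for choosing k (the sign of the coefficient to be made large, a zero coefficient counting as +1) is one simple possibility and is an assumption of the sketch.

```python
# Each row: (n, cos I, sin I, cos 2I, sin 2I), with n = B - 10.00
rows = [
    (0.82, 0.98, 0.17, 0.94, 0.34),
    (1.81, 0.34, 0.94, -0.77, 0.64),
    (1.69, -0.34, 0.94, -0.77, -0.64),
    (1.42, -0.77, 0.64, 0.17, -0.98),
    (0.66, -0.94, -0.34, 0.77, 0.64),
    (-0.13, -0.64, -0.77, -0.17, 0.98),
    (0.43, 0.00, -1.00, -1.00, 0.00),
    (0.48, 0.64, -0.77, -0.17, -0.98),
]

def summation(ks):
    """Return ([kn], [k], [k cos I], [k sin I], [k cos 2I], [k sin 2I])."""
    full = [(n, 1.0, c1, s1, c2, s2) for n, c1, s1, c2, s2 in rows]
    return tuple(round(sum(k * row[j] for k, row in zip(ks, full)), 2)
                 for j in range(6))

k_for_a = [1] * len(rows)                            # favors the coefficient of a
k_for_a1 = [1 if row[1] >= 0 else -1 for row in rows]  # favors the coefficient of a_1

eq_a = summation(k_for_a)    # 7.18 = 8 a - 0.73 a_1 - 0.19 b_1 - 1.00 a_2 + 0.00 b_2
eq_a1 = summation(k_for_a1)  # -0.10 = 0 a + 4.65 a_1 - 1.13 b_1 - 1.00 a_2 + 0.00 b_2
print(eq_a)
print(eq_a1)
```

In each sum the coefficient of the favored unknown is several times larger than any other, which is what makes the successive approximations converge quickly.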

To illustrate the mode of solution we recur to the observation
equations contained in the preceding section and putting
a_0 = 10.00 + a write them as follows, placing the several values
of k at the right of each equation.

               cos I      sin I      cos 2I     sin 2I       k'   k''  k'''  k^iv  k^v

 0.82 = a + 0.98 a_1 + 0.17 b_1 + 0.94 a_2 + 0.34 b_2      +1   +1   +1   +1   +1
 1.81 = a + 0.34 a_1 + 0.94 b_1 - 0.77 a_2 + 0.64 b_2      +1   +1   +1   -1   +1
 1.69 = a - 0.34 a_1 + 0.94 b_1 - 0.77 a_2 - 0.64 b_2      +1   -1   +1   -1   -1
 1.42 = a - 0.77 a_1 + 0.64 b_1 + 0.17 a_2 - 0.98 b_2      +1   -1   +1   +1   -1
 0.66 = a - 0.94 a_1 - 0.34 b_1 + 0.77 a_2 + 0.64 b_2      +1   -1   -1   +1   +1
-0.13 = a - 0.64 a_1 - 0.77 b_1 - 0.17 a_2 + 0.98 b_2      +1   -1   -1   -1   +1
 0.43 = a + 0.00 a_1 - 1.00 b_1 - 1.00 a_2 + 0.00 b_2      +1   +1   -1   -1   -1
 0.48 = a + 0.64 a_1 - 0.77 b_1 - 0.17 a_2 - 0.98 b_2      +1   +1   -1   +1   -1

The summation equations obtained from this group are:

+ 7.18 = 8 a - 0.73 a_1 - 0.19 b_1 - 1.00 a_2 + 0.00 b_2
- 0.10 = 0 a + 4.65 a_1 - 1.13 b_1 - 1.00 a_2 + 0.00 b_2
+ 4.30 = 0 a + 1.15 a_1 + 5.57 b_1 + 0.14 a_2 + 0.00 b_2      (44)
- 0.42 = 0 a + 0.53 a_1 - 0.41 b_1 + 4.42 a_2 + 0.00 b_2
- 0.86 = 0 a + 0.21 a_1 + 0.19 b_1 + 2.54 a_2 + 5.20 b_2

and these correspond to the normal equations of a least square
solution. To apply the method of approximations to the solution
of this group of equations, we write them in the form:

a   = + 0.897 + 0.091 a_1 + 0.024 b_1 + 0.125 a_2
a_1 = - 0.022 + 0.000 a_1 + 0.243 b_1 + 0.215 a_2
b_1 = + 0.772 - 0.206 a_1 + 0.000 b_1 - 0.025 a_2      (45)
a_2 = - 0.095 - 0.120 a_1 + 0.093 b_1 + 0.000 a_2
b_2 = - 0.166 - 0.040 a_1 - 0.036 b_1 - 0.488 a_2

The divisions required in making this transformation were
made by the use of Crelle's Rechentafeln.

By operations which can be performed mentally, we obtain
the following sets of approximations to the values of a_1, b_1, a_2:

          I.       II.      III.
a_1      0.0     + 0.2    + 0.15
b_1    + 0.8     + 0.7    + 0.74
a_2    - 0.1     - 0.0    - 0.04

and substituting in equations (45) the values given under III.
we find the adopted values

a_0 = 10 + a = 10.92,  a_1 = + 0.15,  a_2 = - 0.04,
b_1 = + 0.74,  b_2 = - 0.18

which were employed in the preceding section.
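The successive approximations may also be carried out mechanically. The following sketch repeatedly substitutes the latest values of a_1, b_1, a_2 into the second, third, and fourth of equations (45), starting from zero, and then evaluates a and b_2 from the first and fifth. It is an illustration of the principle rather than a reproduction of the mental computation above, whose intermediate columns were rounded by hand. The coefficient 0.215 of a_2 in the second equation is taken as 1.00/4.65 from equations (44), and is an assumption of the sketch; the function name is hypothetical.

```python
def approximate_solution(sweeps=20):
    """Successive substitution in equations (45), starting from zero."""
    a1 = b1 = a2 = 0.0
    for _ in range(sweeps):
        # Each equation uses the most recent values of the other unknowns.
        a1 = -0.022 + 0.243 * b1 + 0.215 * a2
        b1 = 0.772 - 0.206 * a1 - 0.025 * a2
        a2 = -0.095 - 0.120 * a1 + 0.093 * b1
    a = 0.897 + 0.091 * a1 + 0.024 * b1 + 0.125 * a2
    b2 = -0.166 - 0.040 * a1 - 0.036 * b1 - 0.488 * a2
    return a, a1, b1, a2, b2

a, a1, b1, a2, b2 = approximate_solution()
print([round(v, 2) for v in (10 + a, a1, b1, a2, b2)])
```

Because the coefficients on the right of equations (45) are small, the values settle after a few sweeps and, rounded to two decimals, agree with the adopted values.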